Preface


A First Course in Linear Algebra presents an introduction to the fascinating subject of linear algebra for
students who have a reasonable understanding of basic algebra. Major topics of linear algebra are
presented in detail, with proofs of important theorems provided. Separate sections may be included in which
proofs are examined in further depth; in general, these can be excluded without loss of continuity.
Where possible, applications of key concepts are explored. To assist students who are
interested in continuing in linear algebra, connections to additional topics covered in advanced courses
are introduced.

Each  chapter  begins  with  a  list  of  desired  outcomes  which  a  student  should  be  able  to  achieve  upon 
completing  the  chapter.  Throughout  the  text,  examples  and  diagrams  are  given  to  reinforce  ideas  and 
provide  guidance  on  how  to  approach  various  problems.  Students  are  encouraged  to  work  through  the 
suggested  exercises  provided  at  the  end  of  each  section.  Selected  solutions  to  these  exercises  are  given  at 
the  end  of  the  text. 

As  this  is  an  open  text,  you  are  encouraged  to  interact  with  the  textbook  through  annotating,  revising, 
and reusing to your advantage.

1. Systems of Equations


1.1  Systems  of  Equations,  Geometry 


Outcomes 


A. Relate the types of solution sets of a system of two (three) variables to the intersections of
lines in a plane (the intersection of planes in three-space).


As you may remember, linear equations like 2x + 3y = 6 can be graphed as straight lines in the coordinate
plane. We say that this equation is in two variables, in this case x and y. Suppose you have two such
equations,  each  of  which  can  be  graphed  as  a  straight  line,  and  consider  the  resulting  graph  of  two  lines. 
What  would  it  mean  if  there  exists  a  point  of  intersection  between  the  two  lines?  This  point,  which  lies  on 
both  graphs,  gives  x  and  y  values  for  which  both  equations  are  true.  In  other  words,  this  point  gives  the 
ordered pair (x,y) that satisfies both equations. If the point (x,y) is a point of intersection, we say that (x,y)
is a solution to the two equations. In linear algebra, we are often concerned with finding the solution(s)
to  a  system  of  equations,  if  such  solutions  exist.  First,  we  consider  graphical  representations  of  solutions 
and  later  we  will  consider  the  algebraic  methods  for  finding  solutions. 

When looking for the intersection of two lines in a graph, several situations may arise. The following
picture demonstrates the possible situations when considering two equations (two lines in the graph)
involving two variables.


[Figure: three graphs of two lines, labeled One Solution, No Solutions, and Infinitely Many Solutions]


In the first diagram, there is a unique point of intersection, which means that there is only one (unique)
solution to the two equations. In the second, there are no points of intersection and no solution. When no
solution exists, the two lines are parallel and never intersect. The third situation that
can occur, as demonstrated in diagram three, is that the two lines are really the same line. For example,
x + y = 1 and 2x + 2y = 2 are equations which when graphed yield the same line. In this case there are
infinitely many points which are solutions of these two equations, as every ordered pair which is on the
graph of the line satisfies both equations. When considering linear systems of equations, there are always
three types of solutions possible: exactly one (unique) solution, infinitely many solutions, or no solution.
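
The three cases can also be checked numerically. The following Python sketch is only an illustration and not a method from this text: the function name classify_two_lines is hypothetical, and it separates the cases using the quantity a1*b2 - a2*b1 together with Cramer's rule, tools this section has not introduced. It assumes integer coefficients and that each equation really describes a line.

```python
from fractions import Fraction

def classify_two_lines(a1, b1, c1, a2, b2, c2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2.

    Assumes integer coefficients and that a and b are not both zero
    in either equation, so each equation is a genuine line.
    """
    det = a1 * b2 - a2 * b1
    if det != 0:
        # The lines cross exactly once; Cramer's rule gives the point.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        return "one solution", (x, y)
    # det == 0: the lines are parallel or identical.
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "infinitely many solutions", None
    return "no solution", None

print(classify_two_lines(1, 1, 1, 2, 2, 2))  # x + y = 1 and 2x + 2y = 2
print(classify_two_lines(1, 1, 1, 1, 1, 2))  # x + y = 1 and x + y = 2
```

The first call uses the pair of equations from the paragraph above, which graph as the same line; the second uses two parallel lines chosen here for illustration.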
Solution. Through graphing the above equations and identifying the point of intersection, we can find the
solution(s). Remember that we must have either one solution, infinitely many, or no solutions at all. The
following graph shows the two equations, as well as the intersection. Remember, the point of intersection
represents the solution of the two equations, or the (x,y) which satisfy both equations. In this case, there
is one point of intersection at (-1,4), which means we have one unique solution, x = -1, y = 4.



In  the  above  example,  we  investigated  the  intersection  point  of  two  equations  in  two  variables,  x  and 
y.  Now  we  will  consider  the  graphical  solutions  of  three  equations  in  two  variables. 

Consider  a  system  of  three  equations  in  two  variables.  Again,  these  equations  can  be  graphed  as 
straight  lines  in  the  plane,  so  that  the  resulting  graph  contains  three  straight  lines.  Recall  the  three  possible 
types of solutions: no solution, one solution, and infinitely many solutions. There are now more complex
ways  of  achieving  these  situations,  due  to  the  presence  of  the  third  line.  For  example,  you  can  imagine 
the  case  of  three  intersecting  lines  having  no  common  point  of  intersection.  Perhaps  you  can  also  imagine 
three  intersecting  lines  which  do  intersect  at  a  single  point.  These  two  situations  are  illustrated  below. 


[Figure: two graphs of three lines, labeled No Solution and One Solution]

Consider the first picture above. While all three lines intersect with one another, there is no common
point  of  intersection  where  all  three  lines  meet  at  one  point.  Hence,  there  is  no  solution  to  the  three 
equations.  Remember,  a  solution  is  a  point  (x,y)  which  satisfies  all  three  equations.  In  the  case  of  the 
second  picture,  the  lines  intersect  at  a  common  point.  This  means  that  there  is  one  solution  to  the  three 
equations  whose  graphs  are  the  given  lines.  You  should  take  a  moment  now  to  draw  the  graph  of  a  system 
which  results  in  three  parallel  lines.  Next,  try  the  graph  of  three  identical  lines.  Which  type  of  solution  is 
represented  in  each  of  these  graphs? 

We  have  now  considered  the  graphical  solutions  of  systems  of  two  equations  in  two  variables,  as  well 
as  three  equations  in  two  variables.  However,  there  is  no  reason  to  limit  our  investigation  to  equations  in 
two  variables.  We  will  now  consider  equations  in  three  variables. 

You may recall that equations in three variables, such as 2x + 4y - 5z = 8, form a plane. Above, we
were looking for intersections of lines in order to identify any possible solutions. When graphically solving
systems of equations in three variables, we look for intersections of planes. These points of intersection
give the (x,y,z) that satisfy all the equations in the system. What types of solutions are possible when
working  with  three  variables?  Consider  the  following  picture  involving  two  planes,  which  are  given  by 
two  equations  in  three  variables. 


Notice  how  these  two  planes  intersect  in  a  line.  This  means  that  the  points  (x,y,z)  on  this  line  satisfy 
both  equations  in  the  system.  Since  the  line  contains  infinitely  many  points,  this  system  has  infinitely 
many  solutions. 

It  could  also  happen  that  the  two  planes  fail  to  intersect.  However,  is  it  possible  to  have  two  planes 
intersect  at  a  single  point?  Take  a  moment  to  attempt  drawing  this  situation,  and  convince  yourself  that  it 
is  not  possible!  This  means  that  when  we  have  only  two  equations  in  three  variables,  there  is  no  way  to 
have  a  unique  solution!  Hence,  the  types  of  solutions  possible  for  two  equations  in  three  variables  are  no 
solution  or  infinitely  many  solutions. 

Now  imagine  adding  a  third  plane.  In  other  words,  consider  three  equations  in  three  variables.  What 
types  of  solutions  are  now  possible?  Consider  the  following  diagram. 


In this diagram, there is no point which lies in all three planes. There is no intersection between all
planes so there is no solution. The picture illustrates the situation in which the line of intersection of the
new  plane  with  one  of  the  original  planes  forms  a  line  parallel  to  the  line  of  intersection  of  the  first  two 
planes.  However,  in  three  dimensions,  it  is  possible  for  two  lines  to  fail  to  intersect  even  though  they  are 
not  parallel.  Such  lines  are  called  skew  lines. 

Recall  that  when  working  with  two  equations  in  three  variables,  it  was  not  possible  to  have  a  unique 
solution.  Is  it  possible  when  considering  three  equations  in  three  variables?  In  fact,  it  is  possible,  and  we 
demonstrate  this  situation  in  the  following  picture. 


In  this  case,  the  three  planes  have  a  single  point  of  intersection.  Can  you  think  of  other  types  of 
solutions  possible?  Another  is  that  the  three  planes  could  intersect  in  a  line,  resulting  in  infinitely  many 
solutions,  as  in  the  following  diagram. 


We  have  now  seen  how  three  equations  in  three  variables  can  have  no  solution,  a  unique  solution,  or 
intersect  in  a  line  resulting  in  infinitely  many  solutions.  It  is  also  possible  that  the  three  equations  graph 
the  same  plane,  which  also  leads  to  infinitely  many  solutions. 

You  can  see  that  when  working  with  equations  in  three  variables,  there  are  many  more  ways  to  achieve 
the  different  types  of  solutions  than  when  working  with  two  variables.  It  may  prove  enlightening  to  spend 
time  imagining  (and  drawing)  many  possible  scenarios,  and  you  should  take  some  time  to  try  a  few. 

You  should  also  take  some  time  to  imagine  (and  draw)  graphs  of  systems  in  more  than  three  variables. 
The graphs of equations with more than three variables, such as x + y - 2z + 4w = 8, are often called
hyper-planes. You may soon realize that it is tricky to draw the graphs of hyper-planes! Through the tools
of linear algebra, we can algebraically examine these types of systems which are difficult to graph. In the
following section, we will consider these algebraic tools.

Exercises


Exercise 1.1.1 Graphically, find the point (x_1, y_1) which lies on both lines, x + 3y = 1 and 4x - y = 3.
That is, graph each line and see where they intersect.

Exercise 1.1.2 Graphically, find the point of intersection of the two lines 3x + y = 3 and x + 2y = 1. That
is, graph each line and see where they intersect.

Exercise 1.1.3 You have a system of k equations in two variables, k > 2. Explain the geometric significance
of

(a)  No  solution. 

(b) A unique solution.

(c) An infinite number of solutions.

Outcomes


A.  Use  elementary  operations  to  find  the  solution  to  a  linear  system  of  equations. 

B.  Find  the  row-echelon  form  and  reduced  row-echelon  form  of  a  matrix. 

C.  Determine  whether  a  system  of  linear  equations  has  no  solution,  a  unique  solution  or  an 
infinite  number  of  solutions  from  its  row-echelon  form. 

D.  Solve  a  system  of  equations  using  Gaussian  Elimination  and  Gauss-Jordan  Elimination. 

E.  Model  a  physical  system  with  linear  equations  and  then  solve. 

We have taken an in-depth look at graphical representations of systems of equations, as well as how to
find possible solutions graphically. Our attention now turns to working with systems algebraically.

The relative size of m and n is not important here. Notice that we have allowed a_ij and b_j to be any
real number. We can also call these numbers scalars. We will use this term throughout the text, so keep
in mind that the term scalar just means that we are working with real numbers.

Now, suppose we have a system where b_j = 0 for all j. In other words, every equation equals 0. This is
a special type of system.


Recall  from  the  previous  section  that  our  goal  when  working  with  systems  of  linear  equations  was  to 
find  the  point  of  intersection  of  the  equations  when  graphed.  In  other  words,  we  looked  for  the  solutions  to 
the system. We now wish to find these solutions algebraically. We want to find values for x_1, ..., x_n which
solve all of the equations. If such a set of values exists, we call (x_1, ..., x_n) the solution set.

Recall  the  above  discussions  about  the  types  of  solutions  possible.  We  will  see  that  systems  of  linear 
equations  will  have  one  unique  solution,  infinitely  many  solutions,  or  no  solution.  Consider  the  following 
definition. 


Definition  1.4:  Consistent  and  Inconsistent  Systems 


A  system  of  linear  equations  is  called  consistent  if  there  exists  at  least  one  solution.  It  is  called 
inconsistent if there is no solution.

If you think of each equation as a condition which must be satisfied by the variables, consistent would
mean  there  is  some  choice  of  variables  which  can  satisfy  all  the  conditions.  Inconsistent  would  mean  there 
is  no  choice  of  the  variables  which  can  satisfy  all  of  the  conditions. 

The  following  sections  provide  methods  for  determining  if  a  system  is  consistent  or  inconsistent,  and 
finding  solutions  if  they  exist. 

1.2.1.  Elementary  Operations 


We  begin  this  section  with  an  example.  Recall  from  Example  1.1  that  the  solution  to  the  given  system  was 
(x,y) = (-1,4).


Solution. By graphing these two equations and identifying the point of intersection, we previously found
that (x,y) = (-1,4) is the unique solution.

We can verify algebraically by substituting these values into the original equations, and ensuring that
the equations hold. First, we substitute the values into the first equation and check that it equals 3.

x + y = (-1) + (4) = 3

This equals 3 as needed, so we see that (-1,4) is a solution to the first equation. Substituting the values
into the second equation yields

y - x = (4) - (-1) = 4 + 1 = 5

which is true. For (x,y) = (-1,4) each equation is true and therefore, this is a solution to the system.
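
The same substitution check can be automated. The sketch below is a minimal illustration, assuming each equation is stored as a pair (coefficients, constant); the helper name satisfies and this storage format are choices made here, not notation from the text.

```python
def satisfies(equations, point):
    """Return True if `point` makes every equation true.

    Each equation is a pair (coefficients, constant), read as
    coefficients[0]*x_1 + ... + coefficients[n-1]*x_n = constant.
    """
    return all(
        sum(a * v for a, v in zip(coeffs, point)) == const
        for coeffs, const in equations
    )

# x + y = 3 and y - x = 5, written with the variables in the order (x, y).
system = [((1, 1), 3), ((-1, 1), 5)]
print(satisfies(system, (-1, 4)))  # True
print(satisfies(system, (0, 3)))   # False: (0, 3) fails the second equation
```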

Now,  the  interesting  question  is  this:  If  you  were  not  given  these  numbers  to  verify,  how  could  you 
algebraically  determine  the  solution?  Linear  algebra  gives  us  the  tools  needed  to  answer  this  question. 
The  following  basic  operations  are  important  tools  that  we  will  utilize. 


Definition  1.6:  Elementary  Operations 


Elementary  operations  are  those  operations  consisting  of  the  following. 

1.  Interchange  the  order  in  which  the  equations  are  listed. 

2.  Multiply  any  equation  by  a  nonzero  number. 

3.  Replace  any  equation  with  itself  added  to  a  multiple  of  another  equation. 


It  is  important  to  note  that  none  of  these  operations  will  change  the  set  of  solutions  of  the  system  of 
equations.  In  fact,  elementary  operations  are  the  key  tool  we  use  in  linear  algebra  to  find  solutions  to 
systems  of  equations. 




Consider  the  following  example. 


Example 1.7: Effects of an Elementary Operation

Show that the system

x + y = 7
2x - y = 8

has the same solution as the system

x + y = 7
-3y = -6

Solution. Notice that the second system has been obtained by taking the second equation of the first system
and adding -2 times the first equation, as follows:

2x - y + (-2)(x + y) = 8 + (-2)(7)

By simplifying, we obtain

-3y = -6

which is the second equation in the second system. Now, from here we can solve for y and see that y = 2.
Next, we substitute this value into the first equation as follows

x + y = x + 2 = 7

Hence x = 5 and so (x,y) = (5,2) is a solution to the second system. We want to check if (5,2) is also a
solution to the first system. We check this by substituting (x,y) = (5,2) into the system and ensuring the
equations are true.

x + y = (5) + (2) = 7
2x - y = 2(5) - (2) = 8

Hence, (5,2) is also a solution to the first system.

This  example  illustrates  how  an  elementary  operation  applied  to  a  system  of  two  equations  in  two 
variables  does  not  affect  the  solution  set.  However,  a  linear  system  may  involve  many  equations  and  many 
variables  and  there  is  no  reason  to  limit  our  study  to  small  systems.  For  any  size  of  system  in  any  number 
of  variables,  the  solution  set  is  still  the  collection  of  solutions  to  the  equations.  In  every  case,  the  above 
operations  of  Definition  1.6  do  not  change  the  set  of  solutions  to  the  system  of  linear  equations. 
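
To see the third elementary operation at work on Example 1.7, here is a small Python sketch. It assumes each equation is stored as a list of coefficients followed by the constant, and it numbers equations from 0; the helper names add_multiple and is_solution are hypothetical.

```python
def add_multiple(system, target, source, k):
    """Replace equation `target` by itself plus k times equation `source`."""
    new = [list(eq) for eq in system]
    new[target] = [t + k * s for t, s in zip(system[target], system[source])]
    return new

def is_solution(system, point):
    """Check that `point` satisfies every equation [a_1, ..., a_n, b]."""
    return all(
        sum(a * v for a, v in zip(eq[:-1], point)) == eq[-1]
        for eq in system
    )

first = [[1, 1, 7], [2, -1, 8]]                       # x + y = 7, 2x - y = 8
second = add_multiple(first, target=1, source=0, k=-2)
print(second)                                         # [[1, 1, 7], [0, -3, -6]]
print(is_solution(first, (5, 2)), is_solution(second, (5, 2)))  # True True
```

A check at one point does not prove that the solution sets agree; it only illustrates the claim on this example.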

In the following theorem, we use the notation E_i to represent an equation, while b_i denotes a constant.

Before we proceed with the proof of Theorem 1.8, let us consider this theorem in the context of Example
1.7. Then,

E_1 = x + y, b_1 = 7
E_2 = 2x - y, b_2 = 8

Recall the elementary operations that we used to modify the system in the solution to the example. First,
we added (-2) times the first equation to the second equation. In terms of Theorem 1.8, this action is
given by

E_2 + (-2) E_1 = b_2 + (-2) b_1

or

2x - y + (-2)(x + y) = 8 + (-2)7

This gave us the second system in Example 1.7, given by

E_1 = b_1
E_2 + (-2) E_1 = b_2 + (-2) b_1

From this point, we were able to find the solution to the system. Theorem 1.8 tells us that the solution
we found is in fact a solution to the original system.

We  will  now  prove  Theorem  1.8. 

Proof. 

1. The proof that the systems 1.1 and 1.2 have the same solution set is as follows. Suppose that
(x_1, ..., x_n) is a solution to E_1 = b_1, E_2 = b_2. We want to show that this is a solution to the system
in 1.2 above. This is clear, because the system in 1.2 is the original system, but listed in a different
order. Changing the order does not affect the solution set, so (x_1, ..., x_n) is a solution to 1.2.

2. Next we want to prove that the systems 1.1 and 1.3 have the same solution set. That is, E_1 = b_1, E_2 =
b_2 has the same solution set as the system E_1 = b_1, kE_2 = kb_2 provided k ≠ 0. Let (x_1, ..., x_n) be a
solution of E_1 = b_1, E_2 = b_2. We want to show that it is a solution to E_1 = b_1, kE_2 = kb_2. Notice that
the only difference between these two systems is that the second involves multiplying the equation
E_2 = b_2 by the scalar k. Recall that when you multiply both sides of an equation by the same number,
the sides are still equal to each other. Hence if (x_1, ..., x_n) is a solution to E_2 = b_2, then it will also
be a solution to kE_2 = kb_2. Hence, (x_1, ..., x_n) is also a solution to 1.3.

Similarly, let (x_1, ..., x_n) be a solution of E_1 = b_1, kE_2 = kb_2. Then we can multiply the equation
kE_2 = kb_2 by the scalar 1/k, which is possible only because we have required that k ≠ 0. Just as
above, this action preserves equality and we obtain the equation E_2 = b_2. Hence (x_1, ..., x_n) is also
a solution to E_1 = b_1, E_2 = b_2.

3. Finally, we will prove that the systems 1.1 and 1.4 have the same solution set. We will show that
any solution of E_1 = b_1, E_2 = b_2 is also a solution of 1.4. Then, we will show that any solution of
1.4 is also a solution of E_1 = b_1, E_2 = b_2. Let (x_1, ..., x_n) be a solution to E_1 = b_1, E_2 = b_2. Then
in particular it solves E_1 = b_1. Hence, it solves the first equation in 1.4. Similarly, it also solves
E_2 = b_2. By our proof of 1.3, it also solves kE_1 = kb_1. Notice that if we add E_2 and kE_1, this is equal
to b_2 + kb_1. Therefore, if (x_1, ..., x_n) solves E_1 = b_1, E_2 = b_2 it must also solve E_2 + kE_1 = b_2 + kb_1.

Now suppose (x_1, ..., x_n) solves the system E_1 = b_1, E_2 + kE_1 = b_2 + kb_1. Then in particular it is a
solution of E_1 = b_1. Again by our proof of 1.3, it is also a solution to kE_1 = kb_1. Now if we subtract
these equal quantities from both sides of E_2 + kE_1 = b_2 + kb_1 we obtain E_2 = b_2, which shows that
the solution also satisfies E_1 = b_1, E_2 = b_2.



Stated  simply,  the  above  theorem  shows  that  the  elementary  operations  do  not  change  the  solution  set 
of  a  system  of  equations. 

We will now look at an example of a system of three equations and three variables. As in the
previous examples, the goal is to find values for x,y,z such that each of the given equations is satisfied
when these values are substituted in.


Example  1.9:  Solving  a  System  of  Equations  with  Elementary  Operations 


Find  the  solutions  to  the  system, 

x + 3y + 6z = 25
2x + 7y + 14z = 58     (1.5)
2y + 5z = 19


Solution. We can relate this system to Theorem 1.8 above. In this case, we have

E_1 = x + 3y + 6z, b_1 = 25
E_2 = 2x + 7y + 14z, b_2 = 58
E_3 = 2y + 5z, b_3 = 19

Theorem 1.8 claims that if we do elementary operations on this system, we will not change the solution
set. Therefore, we can solve this system using the elementary operations given in Definition 1.6. First,
replace the second equation by (-2) times the first equation added to the second. This yields the system

x + 3y + 6z = 25
y + 2z = 8             (1.6)
2y + 5z = 19

Now, replace the third equation with (-2) times the second added to the third. This yields the system

x + 3y + 6z = 25
y + 2z = 8             (1.7)
z = 3

At this point, we can easily find the solution. Simply take z = 3 and substitute this back into the previous
equation to solve for y, and similarly to solve for x.

x + 3y + 6(3) = x + 3y + 18 = 25
y + 2(3) = y + 6 = 8
z = 3

The second equation is now

y + 6 = 8

You can see from this equation that y = 2. Therefore, we can substitute this value into the first equation as
follows:

x + 3(2) + 18 = 25

By simplifying this equation, we find that x = 1. Hence, the solution to this system is (x,y,z) = (1,2,3).
This process is called back substitution.
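
Back substitution is mechanical enough to sketch in a few lines of Python. The function below is an illustration under stated assumptions: the system is already triangular as in 1.7, each row lists the coefficients followed by the constant, and every diagonal coefficient is nonzero. The name back_substitute is hypothetical.

```python
from fractions import Fraction

def back_substitute(rows):
    """Solve a triangular system given as rows [a_1, ..., a_n, b].

    Assumes row i has zeros before position i and a nonzero entry at
    position i, as in system 1.7.
    """
    n = len(rows)
    x = [Fraction(0)] * n
    # Work from the last equation up, using the values already found.
    for i in range(n - 1, -1, -1):
        known = sum(rows[i][j] * x[j] for j in range(i + 1, n))
        x[i] = Fraction(rows[i][n] - known) / rows[i][i]
    return x

triangular = [[1, 3, 6, 25], [0, 1, 2, 8], [0, 0, 1, 3]]  # system 1.7
solution = back_substitute(triangular)
print([int(v) for v in solution])  # [1, 2, 3]
```

Exact fractions are used instead of floating-point numbers so that the arithmetic matches a hand computation.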

Alternatively, in 1.7 you could have continued as follows. Add (-2) times the third equation to the
second and then add (-6) times the third to the first. This yields

x + 3y = 7
y = 2
z = 3

Now add (-3) times the second to the first. This yields

x = 1
y = 2
z = 3

a system which has the same solution set as the original system. This avoided back substitution and led
to the same solution set. It is your decision which you prefer to use, as both methods lead to the correct
solution, (x,y,z) = (1,2,3).

1.2.2. Gaussian Elimination


The  work  we  did  in  the  previous  section  will  always  find  the  solution  to  the  system.  In  this  section,  we 
will  explore  a  less  cumbersome  way  to  find  the  solutions.  First,  we  will  represent  a  linear  system  with 
an  augmented  matrix.  A  matrix  is  simply  a  rectangular  array  of  numbers.  The  size  or  dimension  of  a 
matrix is defined as m × n where m is the number of rows and n is the number of columns. In order to
construct  an  augmented  matrix  from  a  linear  system,  we  create  a  coefficient  matrix  from  the  coefficients 
of  the  variables  in  the  system,  as  well  as  a  constant  matrix  from  the  constants.  The  coefficients  from  one 
equation  of  the  system  create  one  row  of  the  augmented  matrix. 

For  example,  consider  the  linear  system  in  Example  1.9 

x + 3y + 6z = 25
2x + 7y + 14z = 58
2y + 5z = 19

This system can be written as an augmented matrix, as follows

[ 1  3   6 | 25 ]
[ 2  7  14 | 58 ]
[ 0  2   5 | 19 ]

Notice that it has exactly the same information as the original system. Here it is understood that the
first column contains the coefficients from x in each equation, in order,

[ 1 ]
[ 2 ]
[ 0 ]

Similarly, we create a column from the coefficients on y in each equation,

[ 3 ]
[ 7 ]
[ 2 ]

and a column from the coefficients on z in each equation,

[  6 ]
[ 14 ]
[  5 ]

For a system of more than three variables, we would continue in this way, constructing a column
for each variable. Similarly, for a system of fewer than three variables, we simply construct a
column for each variable.

Finally, we construct a column from the constants of the equations,

[ 25 ]
[ 58 ]
[ 19 ]

The rows of the augmented matrix correspond to the equations in the system. For example, the top
row in the augmented matrix, [ 1 3 6 | 25 ] corresponds to the equation

x + 3y + 6z = 25.
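
The construction just described can be sketched in Python as follows. This is a minimal illustration, assuming the coefficient matrix is stored as a list of rows and the constants as a separate list; the function name augmented_matrix is hypothetical.

```python
def augmented_matrix(coefficients, constants):
    """Append each equation's constant to its row of coefficients."""
    if len(coefficients) != len(constants):
        raise ValueError("need exactly one constant per equation")
    return [list(row) + [b] for row, b in zip(coefficients, constants)]

# The system of Example 1.9; columns hold the coefficients of x, y, z.
coefficients = [[1, 3, 6], [2, 7, 14], [0, 2, 5]]
constants = [25, 58, 19]

augmented = augmented_matrix(coefficients, constants)
for row in augmented:
    print(row)
```

Each printed row corresponds to one equation, with the last entry playing the role of the number to the right of the bar.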


Consider the following definition.

Now, consider elementary operations in the context of the augmented matrix. The elementary operations
in Definition 1.6 can be used on the rows just as we used them on equations previously. Changes to
a system of equations as a result of an elementary operation are equivalent to changes in the augmented
matrix resulting from the corresponding row operation. Note that Theorem 1.8 implies that any elementary
row operations used on an augmented matrix will not change the solution to the corresponding system of
equations. We now formally define elementary row operations. These are the key tool we will use to find
solutions to systems of equations.


Definition  1.11:  Elementary  Row  Operations 


The  elementary  row  operations  (also  known  as  row  operations)  consist  of  the  following 

1.  Switch  two  rows. 

2.  Multiply  a  row  by  a  nonzero  number. 

3.  Replace  a  row  by  any  multiple  of  another  row  added  to  it. 


Recall how we solved Example 1.9. We can do the exact same steps as above, except now in the
context of an augmented matrix and using row operations. The augmented matrix of this system is

[ 1  3   6 | 25 ]
[ 2  7  14 | 58 ]
[ 0  2   5 | 19 ]

Thus the first step in solving the system given by 1.5 would be to take (-2) times the first row of the
augmented matrix and add it to the second row,

[ 1  3  6 | 25 ]
[ 0  1  2 |  8 ]
[ 0  2  5 | 19 ]

Note how this corresponds to 1.6. Next take (-2) times the second row and add to the third,

[ 1  3  6 | 25 ]
[ 0  1  2 |  8 ]
[ 0  0  1 |  3 ]

This augmented matrix corresponds to the system

x + 3y + 6z = 25
y + 2z = 8
z = 3

which is the same as 1.7. By back substitution you obtain the solution x = 1, y = 2, and z = 3.
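
The three row operations of Definition 1.11 translate directly into code. The following Python sketch is an illustration only: the function names swap, scale, and add_multiple are hypothetical, a matrix is stored as a list of rows, and rows are numbered from 0 rather than from 1.

```python
def swap(m, i, j):
    """Row operation 1: switch rows i and j."""
    m = [row[:] for row in m]
    m[i], m[j] = m[j], m[i]
    return m

def scale(m, i, k):
    """Row operation 2: multiply row i by a nonzero number k."""
    if k == 0:
        raise ValueError("k must be nonzero")
    m = [row[:] for row in m]
    m[i] = [k * a for a in m[i]]
    return m

def add_multiple(m, target, source, k):
    """Row operation 3: replace row `target` by itself plus k times row `source`."""
    m = [row[:] for row in m]
    m[target] = [t + k * s for t, s in zip(m[target], m[source])]
    return m

A = [[1, 3, 6, 25], [2, 7, 14, 58], [0, 2, 5, 19]]      # system 1.5
step1 = add_multiple(A, target=1, source=0, k=-2)       # corresponds to 1.6
step2 = add_multiple(step1, target=2, source=1, k=-2)   # corresponds to 1.7
print(step1)
print(step2)
```

Each function returns a new matrix and leaves its input unchanged, which makes it easy to compare the matrix before and after a step.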

Through  a  systematic  procedure  of  row  operations,  we  can  simplify  an  augmented  matrix  and  carry  it 
to  row-echelon  form  or  reduced  row-echelon  form,  which  we  define  next.  These  forms  are  used  to  find 
the  solutions  of  the  system  of  equations  corresponding  to  the  augmented  matrix. 

In  the  following  definitions,  the  term  leading  entry  refers  to  the  first  nonzero  entry  of  a  row  when 
scanning  the  row  from  left  to  right. 


We  also  consider  another  reduced  form  of  the  augmented  matrix  which  has  one  further  condition. 


Definition  1.13:  Reduced  Row-Echelon  Form 


An  augmented  matrix  is  in  reduced  row-echelon  form  if 

1. All nonzero rows are above any rows of zeros.

2. Each leading entry of a row is in a column to the right of the leading entries of any rows above
it.

3. Each leading entry of a row is equal to 1.

4.  All  entries  in  a  column  above  and  below  a  leading  entry  are  zero. 


Notice  that  the  first  three  conditions  on  a  reduced  row-echelon  form  matrix  are  the  same  as  those  for 
row-echelon  form. 

Hence, every reduced row-echelon form matrix is also in row-echelon form. The converse is not
necessarily true; we cannot assume that every matrix in row-echelon form is also in reduced row-echelon
form. However, it often happens that the row-echelon form is sufficient to provide information about the
solution of a system.
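
The conditions of Definition 1.13 can be checked one by one. The Python sketch below is a hedged illustration: the function names are hypothetical, and is_row_echelon tests conditions 1 to 3, which the text states are the same as those for row-echelon form.

```python
def leading_index(row):
    """Column index of the first nonzero entry, or None for a row of zeros."""
    for j, a in enumerate(row):
        if a != 0:
            return j
    return None

def is_row_echelon(m):
    """Check conditions 1 to 3 of Definition 1.13."""
    prev = -1
    seen_zero_row = False
    for row in m:
        j = leading_index(row)
        if j is None:
            seen_zero_row = True
            continue
        if seen_zero_row:   # condition 1: a nonzero row lies below a zero row
            return False
        if prev >= j:       # condition 2: leading entry not further right
            return False
        if row[j] != 1:     # condition 3: leading entry is not 1
            return False
        prev = j
    return True

def is_reduced_row_echelon(m):
    """Check conditions 1 to 4 of Definition 1.13."""
    if not is_row_echelon(m):
        return False
    for i, row in enumerate(m):
        j = leading_index(row)
        # Condition 4: every other entry in a leading entry's column is zero.
        if j is not None and any(m[r][j] != 0 for r in range(len(m)) if r != i):
            return False
    return True

echelon = [[1, 3, 6, 25], [0, 1, 2, 8], [0, 0, 1, 3]]
reduced = [[1, 0, 0, 1], [0, 1, 0, 2], [0, 0, 1, 3]]
print(is_row_echelon(echelon), is_reduced_row_echelon(echelon))  # True False
print(is_row_echelon(reduced), is_reduced_row_echelon(reduced))  # True True
```

The two test matrices are the augmented matrices for system 1.7 and for the fully solved system x = 1, y = 2, z = 3 from Example 1.9.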

The  following  examples  describe  matrices  in  these  various  forms.  As  an  exercise,  take  the  time  to 
carefully  verify  that  they  are  in  the  specified  form. 


Notice that we could apply further row operations to these matrices to carry them to reduced
row-echelon form. Take the time to try that on your own. Consider the following matrices, which are in
reduced row-echelon form.


One  way  in  which  the  row-echelon  form  of  a  matrix  is  useful  is  in  identifying  the  pivot  positions  and 
pivot columns of the matrix.

Definition 1.17: Pivot Position and Pivot Column

A pivot position in a matrix is the location of a leading entry in the row-echelon form of a matrix.
A pivot column is a column that contains a pivot position.


For  example  consider  the  following. 


Solution. The row-echelon form of this matrix is

\[
\left[ \begin{array}{rrrr}
1 & 2 & 3 & 4 \\
0 & 1 & 2 & \frac{3}{2} \\
0 & 0 & 0 & 0
\end{array} \right]
\]

This  is  all  we  need  in  this  example,  but  note  that  this  matrix  is  not  in  reduced  row-echelon  form. 

In  order  to  identify  the  pivot  positions  in  the  original  matrix,  we  look  for  the  leading  entries  in  the 
row-echelon  form  of  the  matrix.  Here,  the  entry  in  the  first  row  and  first  column,  as  well  as  the  entry  in 
the  second  row  and  second  column  are  the  leading  entries.  Hence,  these  locations  are  the  pivot  positions. 
We  identify  the  pivot  positions  in  the  original  matrix,  as  in  the  following: 


\[
\left[ \begin{array}{rrrr}
\boxed{1} & 2 & 3 & 4 \\
3 & \boxed{2} & 1 & 6 \\
4 & 4 & 4 & 10
\end{array} \right]
\]

Thus the pivot columns in the matrix are the first two columns.
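The pivot columns found above can also be located mechanically. The following is a minimal sketch, not part of the original text: the function name `pivot_columns` is hypothetical, and it assumes the matrix is given as a list of rows. It carries out only the forward part of the elimination, since that is enough to locate the leading entries.

```python
from fractions import Fraction

def pivot_columns(matrix):
    """Return the (1-based) indices of the pivot columns of `matrix`."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots = []
    r = 0  # index of the row that will hold the next pivot
    for c in range(len(rows[0])):
        if r == len(rows):
            break
        # Look for a nonzero entry in column c, at or below row r.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        rows[r], rows[p] = rows[p], rows[r]
        # Create zeros below the pivot position.
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c + 1)
        r += 1
    return pivots

A = [[1, 2, 3, 4],
     [3, 2, 1, 6],
     [4, 4, 4, 10]]
print(pivot_columns(A))
```

For the matrix of this example the sketch reports the first two columns, in agreement with the solution above.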

The  following  is  an  algorithm  for  carrying  a  matrix  to  row-echelon  form  and  reduced  row-echelon 
form.  You  may  wish  to  use  this  algorithm  to  carry  the  above  matrix  to  row-echelon  form  or  reduced 
row-echelon  form  yourself  for  practice. 
Algorithm 1.19: Reduced Row-Echelon Form Algorithm


This algorithm provides a method for using row operations to take a matrix to its reduced row-echelon form. We begin with the matrix in its original form.

1. Starting from the left, find the first nonzero column. This is the first pivot column, and the position at the top of this column is the first pivot position. Switch rows if necessary to place a nonzero number in the first pivot position.

2. Use row operations to make the entries below the first pivot position (in the first pivot column) equal to zero.

3. Ignoring the row containing the first pivot position, repeat steps 1 and 2 with the remaining rows. Repeat the process until there are no more rows to modify.

4. Divide each nonzero row by the value of the leading entry, so that the leading entry becomes 1. The matrix will then be in row-echelon form.

The following step will carry the matrix from row-echelon form to reduced row-echelon form.

5. Moving from right to left, use row operations to create zeros in the entries of the pivot columns which are above the pivot positions. The result will be a matrix in reduced row-echelon form.


Most  often  we  will  apply  this  algorithm  to  an  augmented  matrix  in  order  to  find  the  solution  to  a  system 
of  linear  equations.  However,  we  can  use  this  algorithm  to  compute  the  reduced  row-echelon  form  of  any 
matrix  which  could  be  useful  in  other  applications. 
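The five steps of Algorithm 1.19 can be sketched in code. The following is one possible translation, offered as an illustration rather than a definitive implementation: the function name `rref` is our own, a matrix is assumed to be a list of rows, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def rref(matrix):
    """Return the reduced row-echelon form of `matrix` as a new list of rows."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivots = []  # (row, column) of each pivot position
    r = 0        # row that will hold the next pivot
    # Steps 1-3: work left to right, creating zeros below each pivot.
    for c in range(n_cols):
        if r == n_rows:
            break
        # Step 1: find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if p is None:
            continue  # column c is not a pivot column
        rows[r], rows[p] = rows[p], rows[r]
        # Step 2: make the entries below the pivot position zero.
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append((r, c))
        r += 1
    # Step 4: divide each nonzero row by its leading entry.
    for pr, pc in pivots:
        lead = rows[pr][pc]
        rows[pr] = [a / lead for a in rows[pr]]
    # Step 5: moving right to left, create zeros above each pivot.
    for pr, pc in reversed(pivots):
        for i in range(pr):
            factor = rows[i][pc]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[pr])]
    return rows

A = [[0, -5, -4],
     [1, 4, 3],
     [5, 10, 7]]
for row in rref(A):
    print([str(x) for x in row])
```

The matrix used here is the one from Example 1.20, so the printed rows can be compared with the result of working through the algorithm by hand.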

Consider  the  following  example  of  Algorithm  1.19. 


Example 1.20: Finding Row-Echelon Form and Reduced Row-Echelon Form of a Matrix

Let

\[
A = \left[ \begin{array}{rrr}
0 & -5 & -4 \\
1 & 4 & 3 \\
5 & 10 & 7
\end{array} \right]
\]

Find  the  row-echelon  form  of  A.  Then  complete  the  process  until  A  is  in  reduced  row-echelon  form. 

Solution.  In  working  through  this  example,  we  will  use  the  steps  outlined  in  Algorithm  1.19. 

1. The first pivot column is the first column of the matrix, as this is the first nonzero column from the left. Hence the first pivot position is the one in the first row and first column. Switch the first two rows to obtain a nonzero entry in the first pivot position, outlined in a box below.

\[
\left[ \begin{array}{rrr}
\boxed{1} & 4 & 3 \\
0 & -5 & -4 \\
5 & 10 & 7
\end{array} \right]
\]
2. Step two involves creating zeros in the entries below the first pivot position. The first entry of the second row is already a zero. All we need to do is subtract 5 times the first row from the third row. The resulting matrix is

\[
\left[ \begin{array}{rrr}
1 & 4 & 3 \\
0 & -5 & -4 \\
0 & -10 & -8
\end{array} \right]
\]

3. Now ignore the top row. Apply steps 1 and 2 to the smaller matrix

\[
\left[ \begin{array}{rr}
-5 & -4 \\
-10 & -8
\end{array} \right]
\]

In this matrix, the first column is a pivot column, and -5 is in the first pivot position. Therefore, we need to create a zero below it. To do this, subtract 2 times the first row (of this matrix) from the second. The resulting matrix is

\[
\left[ \begin{array}{rr}
-5 & -4 \\
0 & 0
\end{array} \right]
\]

Our original matrix now looks like

\[
\left[ \begin{array}{rrr}
1 & 4 & 3 \\
0 & -5 & -4 \\
0 & 0 & 0
\end{array} \right]
\]

We  can  see  that  there  are  no  more  rows  to  modify. 

4. Now, we need to create leading 1s in each row. The first row already has a leading 1 so no work is needed here. Divide the second row by -5 to create a leading 1. The resulting matrix is

\[
\left[ \begin{array}{rrr}
1 & 4 & 3 \\
0 & 1 & \frac{4}{5} \\
0 & 0 & 0
\end{array} \right]
\]

This  matrix  is  now  in  row-echelon  form. 

5. Now create zeros in the entries above pivot positions in each column, in order to carry this matrix all the way to reduced row-echelon form. Notice that there is no pivot position in the third column so we do not need to create any zeros in this column! The column in which we need to create zeros is the second. To do so, subtract 4 times the second row from the first row. The resulting matrix is

\[
\left[ \begin{array}{rrr}
1 & 0 & -\frac{1}{5} \\
0 & 1 & \frac{4}{5} \\
0 & 0 & 0
\end{array} \right]
\]

This matrix is now in reduced row-echelon form.

The  above  algorithm  gives  you  a  simple  way  to  obtain  the  row-echelon  form  and  reduced  row-echelon 
form  of  a  matrix.  The  main  idea  is  to  do  row  operations  in  such  a  way  as  to  end  up  with  a  matrix  in 
row-echelon  form  or  reduced  row-echelon  form.  This  process  is  important  because  the  resulting  matrix 
will  allow  you  to  describe  the  solutions  to  the  corresponding  linear  system  of  equations  in  a  meaningful 
way. 
In the next example, we look at how to solve a system of equations using the corresponding augmented matrix.


Solution. The augmented matrix for this system is

\[
\left[ \begin{array}{rrr|r}
2 & 4 & -3 & -1 \\
5 & 10 & -7 & -2 \\
3 & 6 & 5 & 9
\end{array} \right]
\]

In order to find the solution to this system, we wish to carry the augmented matrix to reduced row-echelon form. We will do so using Algorithm 1.19. Notice that the first column is nonzero, so this is our first pivot column. The first entry in the first row, 2, is the first leading entry and it is in the first pivot position. We will use row operations to create zeros in the entries below the 2. First, replace the second row with -5 times the first row plus 2 times the second row. This yields

\[
\left[ \begin{array}{rrr|r}
2 & 4 & -3 & -1 \\
0 & 0 & 1 & 1 \\
3 & 6 & 5 & 9
\end{array} \right]
\]

Now, replace the third row with -3 times the first row plus 2 times the third row. This yields

\[
\left[ \begin{array}{rrr|r}
2 & 4 & -3 & -1 \\
0 & 0 & 1 & 1 \\
0 & 0 & 19 & 21
\end{array} \right]
\]

Now the entries in the first column below the pivot position are zeros. We now look for the second pivot column, which in this case is column three. Here, the 1 in the second row and third column is in the pivot position. We need to do just one row operation to create a zero below the 1.

Taking -19 times the second row and adding it to the third row yields

\[
\left[ \begin{array}{rrr|r}
2 & 4 & -3 & -1 \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 2
\end{array} \right]
\]

We could proceed with the algorithm to carry this matrix to row-echelon form or reduced row-echelon form. However, remember that we are looking for the solutions to the system of equations. Take another look at the third row of the matrix. Notice that it corresponds to the equation

\[
0x + 0y + 0z = 2
\]

There is no solution to this equation because for all x, y, z, the left side will equal 0 and 0 ≠ 2. This shows there is no solution to the given system of equations. In other words, this system is inconsistent.
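The test for inconsistency used in this example can be automated. The sketch below is an illustration under stated assumptions, not part of the original text: the name `is_inconsistent` is hypothetical, and the augmented matrix is assumed to be a list of rows whose last entry is the right-hand side. It eliminates below each pivot in the coefficient columns and then looks for a row whose coefficients are all zero while its right-hand side is not.

```python
from fractions import Fraction

def is_inconsistent(augmented):
    """Return True if the system with this augmented matrix has no solution."""
    rows = [[Fraction(x) for x in row] for row in augmented]
    n_vars = len(rows[0]) - 1
    r = 0
    for c in range(n_vars):  # pivots are sought in coefficient columns only
        if r == len(rows):
            break
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    # A row reading 0 = (nonzero number) signals an inconsistent system.
    return any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in rows)

M = [[2, 4, -3, -1],
     [5, 10, -7, -2],
     [3, 6, 5, 9]]
print(is_inconsistent(M))
```

Applied to the augmented matrix of this example, the check reports that the system is inconsistent, in agreement with the conclusion above.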

The  following  is  another  example  of  how  to  find  the  solution  to  a  system  of  equations  by  carrying  the 
corresponding  augmented  matrix  to  reduced  row-echelon  form. 


Solution. The augmented matrix of this system is

\[
\left[ \begin{array}{rrr|r}
3 & -1 & -5 & 9 \\
0 & 1 & -10 & 0 \\
-2 & 1 & 0 & -6
\end{array} \right]
\]

In  order  to  find  the  solution  to  this  system,  we  will  carry  the  augmented  matrix  to  reduced  row-echelon 
form,  using  Algorithm  1.19.  The  first  column  is  the  first  pivot  column.  We  want  to  use  row  operations  to 
create  zeros  beneath  the  first  entry  in  this  column,  which  is  in  the  first  pivot  position.  Replace  the  third 
row  with  2  times  the  first  row  added  to  3  times  the  third  row.  This  gives 


\[
\left[ \begin{array}{rrr|r}
3 & -1 & -5 & 9 \\
0 & 1 & -10 & 0 \\
0 & 1 & -10 & 0
\end{array} \right]
\]

Now, we have created zeros beneath the 3 in the first column, so we move on to the second pivot column (which is the second column) and repeat the procedure. Take -1 times the second row and add to the third row.

\[
\left[ \begin{array}{rrr|r}
3 & -1 & -5 & 9 \\
0 & 1 & -10 & 0 \\
0 & 0 & 0 & 0
\end{array} \right]
\]

The  entry  below  the  pivot  position  in  the  second  column  is  now  a  zero.  Notice  that  we  have  no  more  pivot 
columns  because  we  have  only  two  leading  entries. 

At this stage, we also want the leading entries to be equal to one. To do so, divide the first row by 3.

\[
\left[ \begin{array}{rrr|r}
1 & -\frac{1}{3} & -\frac{5}{3} & 3 \\
0 & 1 & -10 & 0 \\
0 & 0 & 0 & 0
\end{array} \right]
\]

This  matrix  is  now  in  row-echelon  form. 

Let’s continue with row operations until the matrix is in reduced row-echelon form. This involves creating zeros above the pivot positions in each pivot column. This requires only one step, which is to add 1/3 times the second row to the first row.

\[
\left[ \begin{array}{rrr|r}
1 & 0 & -5 & 3 \\
0 & 1 & -10 & 0 \\
0 & 0 & 0 & 0
\end{array} \right]
\]

This is in reduced row-echelon form, which you should verify using Definition 1.13. The equations corresponding to this reduced row-echelon form are

\[
\begin{array}{l}
x - 5z = 3 \\
y - 10z = 0
\end{array}
\]

or

\[
\begin{array}{l}
x = 3 + 5z \\
y = 10z
\end{array}
\]

Observe that z is not restrained by any equation. In fact, z can equal any number. For example, we can let z = t, where we can choose t to be any number. In this context t is called a parameter. Therefore, the solution set of this system is

\[
\begin{array}{l}
x = 3 + 5t \\
y = 10t \\
z = t
\end{array}
\]

where t is arbitrary. The system has an infinite set of solutions which are given by these equations. For any value of t we select, x, y, and z will be given by the above equations. For example, if we choose t = 4 then the corresponding solution would be

\[
\begin{array}{l}
x = 3 + 5(4) = 23 \\
y = 10(4) = 40 \\
z = 4
\end{array}
\]
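A parametric solution such as this one is easy to check by substitution. The following sketch, with the hypothetical helper names `solution` and `satisfies`, substitutes several values of t into the equations read off from the augmented matrix of this example. Checking a few values does not prove the formula, but it is a quick way to catch arithmetic slips.

```python
from fractions import Fraction

# Each row is [a, b, c, d], standing for the equation ax + by + cz = d.
augmented = [[3, -1, -5, 9],
             [0, 1, -10, 0],
             [-2, 1, 0, -6]]

def solution(t):
    """The claimed solution (x, y, z) for a given value of the parameter t."""
    return (3 + 5 * t, 10 * t, t)

def satisfies(system, values):
    """Check that `values` satisfies every equation of `system`."""
    return all(sum(a * v for a, v in zip(row[:-1], values)) == row[-1]
               for row in system)

for t in [0, 4, -2, Fraction(1, 3)]:
    assert satisfies(augmented, solution(t))

print(solution(4))
```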

In  Example  1.22  the  solution  involved  one  parameter.  It  may  happen  that  the  solution  to  a  system 
involves  more  than  one  parameter,  as  shown  in  the  following  example. 



Example 1.23: A Two Parameter Set of Solutions

Find the solution to the system

\[
\begin{array}{l}
x + 2y - z + w = 3 \\
x + y - z + w = 1 \\
x + 3y - z + w = 5
\end{array}
\]

Solution. The augmented matrix is

\[
\left[ \begin{array}{rrrr|r}
1 & 2 & -1 & 1 & 3 \\
1 & 1 & -1 & 1 & 1 \\
1 & 3 & -1 & 1 & 5
\end{array} \right]
\]

We  wish  to  carry  this  matrix  to  row-echelon  form.  Here,  we  will  outline  the  row  operations  used.  However, 
make  sure  that  you  understand  the  steps  in  terms  of  Algorithm  1.19. 
Take -1 times the first row and add to the second. Then take -1 times the first row and add to the third. This yields

\[
\left[ \begin{array}{rrrr|r}
1 & 2 & -1 & 1 & 3 \\
0 & -1 & 0 & 0 & -2 \\
0 & 1 & 0 & 0 & 2
\end{array} \right]
\]

Now add the second row to the third row and divide the second row by -1.

\[
\left[ \begin{array}{rrrr|r}
1 & 2 & -1 & 1 & 3 \\
0 & 1 & 0 & 0 & 2 \\
0 & 0 & 0 & 0 & 0
\end{array} \right]
\tag{1.9}
\]

This matrix is in row-echelon form and we can see that x and y correspond to pivot columns, while z and w do not. Therefore, we will assign parameters to the variables z and w. Assign the parameter s to z and the parameter t to w. Then the first row yields the equation x + 2y - s + t = 3, while the second row yields the equation y = 2. Since y = 2, the first equation becomes x + 4 - s + t = 3, showing that the solution is given by

\[
\begin{array}{l}
x = -1 + s - t \\
y = 2 \\
z = s \\
w = t
\end{array}
\]

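It is customary to write this solution in the form

\[
\left[ \begin{array}{c} x \\ y \\ z \\ w \end{array} \right]
=
\left[ \begin{array}{c} -1 + s - t \\ 2 \\ s \\ t \end{array} \right]
\tag{1.10}
\]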


This example shows a system of equations with an infinite solution set which depends on two parameters. It can be less confusing in the case of an infinite solution set to first place the augmented matrix in reduced row-echelon form rather than just row-echelon form before seeking to write down the description of the solution.

In the above steps, this means we don’t stop with the row-echelon form in equation 1.9. Instead we first place it in reduced row-echelon form as follows.

\[
\left[ \begin{array}{rrrr|r}
1 & 0 & -1 & 1 & -1 \\
0 & 1 & 0 & 0 & 2 \\
0 & 0 & 0 & 0 & 0
\end{array} \right]
\]

Then the solution is y = 2 from the second row and x = -1 + z - w from the first. Thus letting z = s and w = t, the solution is given by 1.10.

You can see here that there are two paths, and both yield the same answer. Hence, either approach may be used. The process which we first used in the above solution is called Gaussian Elimination. This process involves carrying the matrix to row-echelon form, converting back to equations, and using back substitution to find the solution. When you do row operations until you obtain reduced row-echelon form, the process is called Gauss-Jordan Elimination.
We have now worked with systems of equations that have no solution and systems that have infinitely many solutions, with one parameter as well as two parameters. Recall the three types of solution sets which we discussed in the previous section: no solution, one solution, and infinitely many solutions. Each of these types of solutions could be identified from the graph of the system. It turns out that we can also identify the type of solution from the reduced row-echelon form of the augmented matrix.

• No Solution: In the case where the system of equations has no solution, the row-echelon form of the augmented matrix will have a row of the form

\[
\left[ \begin{array}{ccc|c} 0 & 0 & 0 & 1 \end{array} \right]
\]

This  row  indicates  that  the  system  is  inconsistent  and  has  no  solution. 

•  One  Solution:  In  the  case  where  the  system  of  equations  has  one  solution,  every  column  of  the 
coefficient matrix is a pivot column. The following is an example of an augmented matrix in reduced row-echelon form for a system of equations with one solution.

\[
\left[ \begin{array}{rrr|r}
1 & 0 & 0 & 5 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 2
\end{array} \right]
\]

•  Infinitely  Many  Solutions:  In  the  case  where  the  system  of  equations  has  infinitely  many  solutions, 
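the solution contains parameters. There will be columns of the coefficient matrix which are not pivot columns. The following are examples of augmented matrices in reduced row-echelon form for systems of equations with infinitely many solutions.

\[
\left[ \begin{array}{rrr|r}
1 & 0 & 0 & 5 \\
0 & 1 & 2 & -3 \\
0 & 0 & 0 & 0
\end{array} \right]
\quad \text{or} \quad
\left[ \begin{array}{rrr|r}
1 & 0 & 0 & 5 \\
0 & 1 & 0 & -3
\end{array} \right]
\]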
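The three cases above amount to a simple decision rule, which can be sketched as follows. This is only an illustration: the function name `classify` is hypothetical, and the input is assumed to be an augmented matrix that is already in row-echelon or reduced row-echelon form, so that every row with a nonzero coefficient contributes exactly one pivot column of the coefficient matrix.

```python
def classify(echelon_augmented):
    """Classify a system from its augmented matrix in (reduced) row-echelon form."""
    n_vars = len(echelon_augmented[0]) - 1
    pivots = 0
    for row in echelon_augmented:
        coefficients, right_side = row[:-1], row[-1]
        if any(x != 0 for x in coefficients):
            pivots += 1  # one leading entry in the coefficient part
        elif right_side != 0:
            return "no solution"  # a row of the form [0 ... 0 | c], c nonzero
    if pivots == n_vars:
        return "one solution"
    return "infinitely many solutions"

print(classify([[1, 0, 0, 5], [0, 1, 0, 0], [0, 0, 1, 2]]))
print(classify([[1, 0, 0, 5], [0, 1, 2, -3], [0, 0, 0, 0]]))
print(classify([[1, 0, 0, 5], [0, 1, 0, -3]]))
```

The three calls use the example matrices displayed in the list above.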

1.2.3.  Uniqueness  of  the  Reduced  Row-Echelon  Form 


As  we  have  seen  in  earlier  sections,  we  know  that  every  matrix  can  be  brought  into  reduced  row-echelon 
form  by  a  sequence  of  elementary  row  operations.  Here  we  will  prove  that  the  resulting  matrix  is  unique; 
in  other  words,  the  resulting  matrix  in  reduced  row-echelon  form  does  not  depend  upon  the  particular 
sequence  of  elementary  row  operations  or  the  order  in  which  they  were  performed. 

Let A be the augmented matrix of a homogeneous system of linear equations in the variables $x_1, x_2, \ldots, x_n$ which is also in reduced row-echelon form. The matrix A divides the set of variables into two different types. We say that $x_i$ is a basic variable whenever A has a leading 1 in column number i, in other words, when column i is a pivot column. Otherwise we say that $x_i$ is a free variable.

Recall  Example  1.23. 
Solution. Recall from the solution of Example 1.23 that the row-echelon form of the augmented matrix of this system is given by

\[
\left[ \begin{array}{rrrr|r}
1 & 2 & -1 & 1 & 3 \\
0 & 1 & 0 & 0 & 2 \\
0 & 0 & 0 & 0 & 0
\end{array} \right]
\]

You  can  see  that  columns  1  and  2  are  pivot  columns.  These  columns  correspond  to  variables  x  and  y, 
making  these  the  basic  variables.  Columns  3  and  4  are  not  pivot  columns,  which  means  that  z  and  w  are 
free  variables. 

We can write the solution to this system as

\[
\begin{array}{l}
x = -1 + s - t \\
y = 2 \\
z = s \\
w = t
\end{array}
\]

Here the free variables are written as parameters, and the basic variables are given by linear functions of these parameters.
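Sorting variables into basic and free is a matter of reading off the pivot columns. Here is a small sketch of that bookkeeping; the function name `basic_and_free` and the list of variable names passed to it are our own choices for illustration. The input is assumed to be an augmented matrix already in row-echelon form.

```python
def basic_and_free(echelon_augmented, names):
    """Split variable names into basic and free, given a matrix in row-echelon form."""
    pivot_cols = []
    for row in echelon_augmented:
        # The leading entry is the first nonzero coefficient in the row, if any.
        for j, entry in enumerate(row[:-1]):
            if entry != 0:
                pivot_cols.append(j)
                break
    basic = [names[j] for j in pivot_cols]
    free = [name for j, name in enumerate(names) if j not in pivot_cols]
    return basic, free

M = [[1, 2, -1, 1, 3],
     [0, 1, 0, 0, 2],
     [0, 0, 0, 0, 0]]
print(basic_and_free(M, ["x", "y", "z", "w"]))
```

For the matrix of this example, x and y come out as basic variables and z and w as free variables.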

In general, all solutions can be written in terms of the free variables. In such a description, the free variables can take any values (they become parameters), while the basic variables become simple linear functions of these parameters. Indeed, a basic variable $x_i$ is a linear function of only those free variables $x_j$ with j > i. This leads to the following observation.


Proposition  1.25:  Basic  and  Free  Variables 


If $x_i$ is a basic variable of a homogeneous system of linear equations, then any solution of the system with $x_j = 0$ for all those free variables $x_j$ with j > i must also have $x_i = 0$.


Using  this  proposition,  we  prove  a  lemma  which  will  be  used  in  the  proof  of  the  main  result  of  this 
section  below. 


Lemma  1.26:  Solutions  and  the  Reduced  Row-Echelon  Form  of  a  Matrix 


Let  A  and  B  be  two  distinct  augmented  matrices  for  two  homogeneous  systems  of  m  equations  in  n 
variables,  such  that  A  and  B  are  each  in  reduced  row-echelon  form.  Then,  the  two  systems  do  not 
have  exactly  the  same  solutions. 


Proof. With respect to the linear systems associated with the matrices A and B, there are two cases to consider:

• Case 1: the two systems have the same basic variables

• Case 2: the two systems do not have the same basic variables

In case 1, the two matrices will have exactly the same pivot positions. However, since A and B are not identical, there is some row of A which is different from the corresponding row of B and yet the rows each have a pivot in the same column position. Let i be the index of this column position. Since the matrices are in reduced row-echelon form, the two rows must differ at some entry in a column j > i. Let these entries be a in A and b in B, where a ≠ b. Since A is in reduced row-echelon form, if $x_j$ were a basic variable for its linear system, we would have a = 0. Similarly, if $x_j$ were a basic variable for the linear system of the matrix B, we would have b = 0. Since a and b are unequal, they cannot both be equal to 0, and hence $x_j$ cannot be a basic variable for both linear systems. However, since the systems have the same basic variables, $x_j$ must then be a free variable for each system. We now look at the solutions of the systems in which $x_j$ is set equal to 1 and all other free variables are set equal to 0. For this choice of parameters, the solution of the system for matrix A has $x_i = -a$, while the solution of the system for matrix B has $x_i = -b$, so that the two systems have different solutions.

In case 2, there is a variable $x_i$ which is a basic variable for one matrix, let’s say A, and a free variable for the other matrix B. The system for matrix B has a solution in which $x_i = 1$ and $x_j = 0$ for all other free variables $x_j$. However, by Proposition 1.25 this cannot be a solution of the system for the matrix A. This completes the proof of case 2.

Now,  we  say  that  the  matrix  B  is  equivalent  to  the  matrix  A  provided  that  B  can  be  obtained  from  A 
by  performing  a  sequence  of  elementary  row  operations  beginning  with  A.  The  importance  of  this  concept 
lies  in  the  following  result. 


Theorem  1.27:  Equivalent  Matrices 


The  two  linear  systems  of  equations  corresponding  to  two  equivalent  augmented  matrices  have 
exactly  the  same  solutions. 


The  proof  of  this  theorem  is  left  as  an  exercise. 

Now,  we  can  use  Lemma  1.26  and  Theorem  1.27  to  prove  the  main  result  of  this  section. 


Theorem  1.28:  Uniqueness  of  the  Reduced  Row-Echelon  Form 


Every  matrix  A  is  equivalent  to  a  unique  matrix  in  reduced  row-echelon  form. 


Proof. Let A be an m × n matrix and let B and C be matrices in reduced row-echelon form, each equivalent to A. It suffices to show that B = C.

Let $A^+$ be the matrix A augmented with a new rightmost column consisting entirely of zeros. Similarly, augment matrices B and C each with a rightmost column of zeros to obtain $B^+$ and $C^+$. Note that $B^+$ and $C^+$ are matrices in reduced row-echelon form which are obtained from $A^+$ by respectively applying the same sequence of elementary row operations which were used to obtain B and C from A.

Now, $A^+$, $B^+$, and $C^+$ can all be considered as augmented matrices of homogeneous linear systems in the variables $x_1, x_2, \ldots, x_n$. Because $B^+$ and $C^+$ are each equivalent to $A^+$, Theorem 1.27 ensures that all three homogeneous linear systems have exactly the same solutions. By Lemma 1.26 we conclude that $B^+ = C^+$. By construction, we must also have B = C.

According  to  this  theorem  we  can  say  that  each  matrix  A  has  a  unique  reduced  row-echelon  form. 
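The uniqueness statement can be illustrated numerically, although an example is of course not a proof. In the sketch below, the function `rref` (our own name) performs the reduction in a different order from Algorithm 1.19: it scales each pivot row and clears the whole pivot column at once. It is applied to the matrix A of Example 1.20 and to a matrix B obtained from A by a few elementary row operations chosen arbitrarily for this illustration.

```python
from fractions import Fraction

def rref(matrix):
    """Reduced row-echelon form, clearing each pivot column in a single pass."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        if r == len(rows):
            break
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        lead = rows[r][c]
        rows[r] = [a / lead for a in rows[r]]  # make the leading entry 1
        for i in range(len(rows)):
            if i != r:  # clear entries both above and below the pivot
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return rows

A = [[0, -5, -4],
     [1, 4, 3],
     [5, 10, 7]]

# Build B from A using elementary row operations only.
B = [row[:] for row in reversed(A)]            # reorder the rows
B[0] = [2 * x for x in B[0]]                   # multiply a row by 2
B[2] = [x + 3 * y for x, y in zip(B[2], B[1])] # add 3 times one row to another

print(rref(A) == rref(B))
```

Since B is equivalent to A, Theorem 1.28 predicts that both reductions end at the same matrix, and the comparison confirms this for the example.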

1.2.4.  Rank  and  Homogeneous  Systems 


There is a special type of system which requires additional study. This type of system is called a homogeneous system of equations, which we defined above in Definition 1.3. Our focus in this section is to consider what types of solutions are possible for a homogeneous system of equations.

Consider  the  following  definition. 


If the system has a solution in which not all of the $x_1, \ldots, x_n$ are equal to zero, then we call this solution nontrivial. The trivial solution does not tell us much about the system, as it says that 0 = 0! Therefore, when working with homogeneous systems of equations, we want to know when the system has a nontrivial solution.

Suppose we have a homogeneous system of m equations, using n variables, and suppose that n > m. In other words, there are more variables than equations. Then, it turns out that this system always has a nontrivial solution. Not only will the system have a nontrivial solution, but it also will have infinitely many solutions. It is also possible, but not required, to have a nontrivial solution if n = m or n < m.

Consider  the  following  example. 


Example  1.30:  Solutions  to  a  Homogeneous  System  of  Equations 


Find  the  nontrivial  solutions  to  the  following  homogeneous  system  of  equations 

\[
\begin{array}{l}
2x + y - z = 0 \\
x + 2y - 2z = 0
\end{array}
\]


Solution. Notice that this system has m = 2 equations and n = 3 variables, so n > m. Therefore by our previous discussion, we expect this system to have infinitely many solutions.

The process we use to find the solutions for a homogeneous system of equations is the same process we used in the previous section. First, we construct the augmented matrix, given by

\[
\left[ \begin{array}{rrr|r}
2 & 1 & -1 & 0 \\
1 & 2 & -2 & 0
\end{array} \right]
\]

Then, we carry this matrix to its reduced row-echelon form, given below.

\[
\left[ \begin{array}{rrr|r}
1 & 0 & 0 & 0 \\
0 & 1 & -1 & 0
\end{array} \right]
\]

The corresponding system of equations is

\[
\begin{array}{l}
x = 0 \\
y - z = 0
\end{array}
\]

Since z is not restrained by any equation, we know that this variable will become our parameter. Let z = t where t is any number. Therefore, our solution has the form

\[
\begin{array}{l}
x = 0 \\
y = t \\
z = t
\end{array}
\]

Hence this system has infinitely many solutions, with one parameter t.

Suppose  we  were  to  write  the  solution  to  the  previous  example  in  another  form.  Specifically, 


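\[
\begin{array}{l}
x = 0 \\
y = 0 + t \\
z = 0 + t
\end{array}
\]

can be written as

\[
\left[ \begin{array}{c} x \\ y \\ z \end{array} \right]
=
\left[ \begin{array}{c} 0 \\ 0 \\ 0 \end{array} \right]
+ t \left[ \begin{array}{c} 0 \\ 1 \\ 1 \end{array} \right]
\]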

Notice that we have constructed a column from the constants in the solution (all equal to 0), as well as a column corresponding to the coefficients on t in each equation. While we will discuss this form of solution more in further chapters, for now consider the column of coefficients of the parameter t. In this case, this is the column

\[
\left[ \begin{array}{c} 0 \\ 1 \\ 1 \end{array} \right]
\]

There is a special name for this column, which is basic solution. The basic solutions of a system are columns constructed from the coefficients on parameters in the solution. We often denote basic solutions by $X_1, X_2$ etc., depending on how many solutions occur. Therefore, Example 1.30 has the basic solution

\[
X_1 = \left[ \begin{array}{c} 0 \\ 1 \\ 1 \end{array} \right]
\]


We  explore  this  further  in  the  following  example. 
Example 1.31: Basic Solutions of a Homogeneous System


Consider the following homogeneous system of equations.

\[
\begin{array}{l}
x + 4y + 3z = 0 \\
3x + 12y + 9z = 0
\end{array}
\]

Find the basic solutions to this system.


Solution. The augmented matrix of this system and the resulting reduced row-echelon form are

\[
\left[ \begin{array}{rrr|r}
1 & 4 & 3 & 0 \\
3 & 12 & 9 & 0
\end{array} \right]
\rightarrow \cdots \rightarrow
\left[ \begin{array}{rrr|r}
1 & 4 & 3 & 0 \\
0 & 0 & 0 & 0
\end{array} \right]
\]

When written in equations, this system is given by

\[
x + 4y + 3z = 0
\]

Notice that only x corresponds to a pivot column. In this case, we will have two parameters, one for y and one for z. Let y = s and z = t for any numbers s and t. Then, our solution becomes

\[
\begin{array}{l}
x = -4s - 3t \\
y = s \\
z = t
\end{array}
\]


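which can be written as

\[
\left[ \begin{array}{c} x \\ y \\ z \end{array} \right]
=
\left[ \begin{array}{c} 0 \\ 0 \\ 0 \end{array} \right]
+ s \left[ \begin{array}{r} -4 \\ 1 \\ 0 \end{array} \right]
+ t \left[ \begin{array}{r} -3 \\ 0 \\ 1 \end{array} \right]
\]

You can see here that we have two columns of coefficients corresponding to parameters, specifically one for s and one for t. Therefore, this system has two basic solutions! These are

\[
X_1 = \left[ \begin{array}{r} -4 \\ 1 \\ 0 \end{array} \right],
\quad
X_2 = \left[ \begin{array}{r} -3 \\ 0 \\ 1 \end{array} \right]
\]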

We  now  present  a  new  definition. 


Definition  1.32:  Linear  Combination 


Let $X_1, \ldots, X_n, V$ be column matrices. Then V is said to be a linear combination of the columns $X_1, \ldots, X_n$ if there exist scalars $a_1, \ldots, a_n$ such that

\[
V = a_1 X_1 + \cdots + a_n X_n
\]


A  remarkable  result  of  this  section  is  that  a  linear  combination  of  the  basic  solutions  is  again  a  solution 
to  the  system.  Even  more  remarkable  is  that  every  solution  can  be  written  as  a  linear  combination  of  these 
solutions. Therefore, if we take a linear combination of the two solutions to Example 1.31, this would also be a solution. For example, we could take the following linear combination

\[
3 \left[ \begin{array}{r} -4 \\ 1 \\ 0 \end{array} \right]
+ 2 \left[ \begin{array}{r} -3 \\ 0 \\ 1 \end{array} \right]
=
\left[ \begin{array}{r} -18 \\ 3 \\ 2 \end{array} \right]
\]

You should take a moment to verify that

\[
\left[ \begin{array}{c} x \\ y \\ z \end{array} \right]
=
\left[ \begin{array}{r} -18 \\ 3 \\ 2 \end{array} \right]
\]

is in fact a solution to the system in Example 1.31.
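The verification suggested here can also be carried out in a few lines of code. In the sketch below, the helper names `combination` and `is_solution` are hypothetical. It forms linear combinations of the two basic solutions of Example 1.31 and substitutes each one into both equations of the system.

```python
# Coefficient matrix of the homogeneous system in Example 1.31.
coefficients = [[1, 4, 3],
                [3, 12, 9]]

X1 = [-4, 1, 0]
X2 = [-3, 0, 1]

def combination(a, b):
    """Return the linear combination a*X1 + b*X2 as a list."""
    return [a * u + b * v for u, v in zip(X1, X2)]

def is_solution(v):
    """Check that v makes every equation of the homogeneous system equal 0."""
    return all(sum(c * x for c, x in zip(row, v)) == 0 for row in coefficients)

print(combination(3, 2), is_solution(combination(3, 2)))

# Every combination on a small grid of integer scalars is again a solution.
assert all(is_solution(combination(a, b))
           for a in range(-3, 4) for b in range(-3, 4))
```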

Another  way  in  which  we  can  find  out  more  information  about  the  solutions  of  a  homogeneous  system 
is  to  consider  the  rank  of  the  associated  coefficient  matrix.  We  now  define  what  is  meant  by  the  rank  of  a 
matrix. 


Definition  1.33:  Rank  of  a  Matrix 


Let A be a matrix and consider any row-echelon form of A. Then, the number r of leading entries in this row-echelon form does not depend on the row-echelon form you choose, and is called the rank of A. We denote it by rank(A).


Similarly,  we  could  count  the  number  of  pivot  positions  (or  pivot  columns)  to  determine  the  rank  of  A. 


Solution. First, we need to find the reduced row-echelon form of A. Through the usual algorithm, we find that this is

\[
\left[ \begin{array}{rrr}
\boxed{1} & 0 & -1 \\
0 & \boxed{1} & 2 \\
0 & 0 & 0
\end{array} \right]
\]

Here we have two leading entries, or two pivot positions, shown above in boxes. The rank of A is r = 2.

Notice  that  we  would  have  achieved  the  same  answer  if  we  had  found  the  row-echelon  form  of  A 
instead  of  the  reduced  row-echelon  form. 

Suppose  we  have  a  homogeneous  system  of  m  equations  in  n  variables,  and  suppose  that  n  >  m.  From 
our  above  discussion,  we  know  that  this  system  will  have  infinitely  many  solutions.  If  we  consider  the 
rank of the coefficient matrix of this system, we can find out even more about the solution. Note that we are looking at just the coefficient matrix, not the entire augmented matrix.


Theorem  1.35:  Rank  and  Solutions  to  a  Homogeneous  System 


Let A be the m × n coefficient matrix corresponding to a homogeneous system of equations, and suppose A has rank r. Then, the solution to the corresponding system has n - r parameters.


Consider  our  above  Example  1.31  in  the  context  of  this  theorem.  The  system  in  this  example  has  m  =  2 
equations  in  n  =  3  variables.  First,  because  n  >  m,  we  know  that  the  system  has  a  nontrivial  solution,  and 
therefore  infinitely  many  solutions.  This  tells  us  that  the  solution  will  contain  at  least  one  parameter.  The 
rank  of  the  coefficient  matrix  can  tell  us  even  more  about  the  solution!  The  rank  of  the  coefficient  matrix 
of  the  system  is  1,  as  it  has  one  leading  entry  in  row-echelon  form.  Theorem  1.35  tells  us  that  the  solution 
will  have  n  —  r  =  3  —  1=  2  parameters.  You  can  check  that  this  is  true  in  the  solution  to  Example  1.31. 

Notice that if n = m or n < m, it is possible to have either a unique solution (which will be the trivial
solution)  or  infinitely  many  solutions. 

We  are  not  limited  to  homogeneous  systems  of  equations  here.  The  rank  of  a  matrix  can  be  used  to 
learn  about  the  solutions  of  any  system  of  linear  equations.  In  the  previous  section,  we  discussed  that  a 
system  of  equations  can  have  no  solution,  a  unique  solution,  or  infinitely  many  solutions.  Suppose  the 
system  is  consistent,  whether  it  is  homogeneous  or  not.  The  following  theorem  tells  us  how  we  can  use 
the  rank  to  learn  about  the  type  of  solution  we  have. 


Theorem  1.36:  Rank  and  Solutions  to  a  Consistent  System  of  Equations 


Let A be the m × (n + 1) augmented matrix corresponding to a consistent system of equations in n
variables, and suppose A has rank r. Then

1. the system has a unique solution if r = n

2. the system has infinitely many solutions if r < n


We  will  not  present  a  formal  proof  of  this,  but  consider  the  following  discussions. 

1.  No  Solution  The  above  theorem  assumes  that  the  system  is  consistent,  that  is,  that  it  has  a  solution. 
It  turns  out  that  it  is  possible  for  the  augmented  matrix  of  a  system  with  no  solution  to  have  any 
rank  r  as  long  as  r  >  1.  Therefore,  we  must  know  that  the  system  is  consistent  in  order  to  use  this 
theorem! 

2. Unique Solution Suppose r = n. Then, there is a pivot position in every column of the coefficient
matrix of A. Hence, there is a unique solution.

3. Infinitely Many Solutions Suppose r < n. Then there are infinitely many solutions. There are fewer
pivot positions (and hence fewer leading entries) than columns, meaning that not every column is a
pivot column. The columns which are not pivot columns correspond to parameters. In fact, in this
case we have n − r parameters.
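
The counting in the discussion above can be checked by machine. The following is a minimal sketch, not a definitive implementation: it row reduces a matrix with exact fractions, counts the pivot columns to get the rank r, and reports n − r. The function name rref and the sample matrix A are assumptions chosen for illustration; A is simply one matrix whose reduced row-echelon form has two pivots.

```python
from fractions import Fraction

def rref(matrix):
    """Return (reduced row-echelon form, list of pivot column indices)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots = []
    r = 0  # index of the row that will receive the next pivot
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        # Look for a nonzero entry in column c at or below row r.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]  # make the leading entry 1
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    return rows, pivots

# Illustrative coefficient matrix (an assumption, not taken from the text).
A = [[1, 2, 3],
     [1, 5, 9],
     [2, 4, 6]]

R, pivots = rref(A)
r = len(pivots)   # rank = number of pivot positions
n = len(A[0])     # number of variables
print("rank:", r)
print("parameters in the homogeneous solution:", n - r)
```

Exact fractions are used instead of floating point so that an entry which should be zero is exactly zero, which matters when deciding whether a column contains a pivot.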
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1.2.5.  Balancing  Chemical  Reactions 


The  tools  of  linear  algebra  can  also  be  used  in  the  subject  area  of  Chemistry,  specifically  for  balancing 
chemical  reactions. 

Consider  the  chemical  reaction 

SnO2 + H2 → Sn + H2O

Here  the  elements  involved  are  tin  (Sn),  oxygen  (O),  and  hydrogen  (H).  A  chemical  reaction  occurs  and 
the  result  is  a  combination  of  tin  (Sn)  and  water  (H2O).  When  considering  chemical  reactions,  we  want 
to  investigate  how  much  of  each  element  we  began  with  and  how  much  of  each  element  is  involved  in  the 
result. 

An  important  theory  we  will  use  here  is  the  mass  balance  theory.  It  tells  us  that  we  cannot  create  or 
delete  elements  within  a  chemical  reaction.  For  example,  in  the  above  expression,  we  must  have  the  same 
number  of  oxygen,  tin,  and  hydrogen  on  both  sides  of  the  reaction.  Notice  that  this  is  not  currently  the 
case.  For  example,  there  are  two  oxygen  atoms  on  the  left  and  only  one  on  the  right.  In  order  to  fix  this, 
we  want  to  find  numbers  x,y,z,w  such  that 

x SnO2 + y H2 → z Sn + w H2O

where  both  sides  of  the  reaction  have  the  same  number  of  atoms  of  the  various  elements. 

This  is  a  familiar  problem.  We  can  solve  it  by  setting  up  a  system  of  equations  in  the  variables  x,y,z,w. 
Thus  you  need 

Sn: x = z
O: 2x = w
H: 2y = 2w

We can rewrite these equations as

Sn: x − z = 0
O: 2x − w = 0
H: 2y − 2w = 0

The augmented matrix for this system of equations is given by

\[
\left[ \begin{array}{rrrr|r}
1 & 0 & -1 & 0 & 0 \\
2 & 0 & 0 & -1 & 0 \\
0 & 2 & 0 & -2 & 0
\end{array} \right]
\]

The reduced row-echelon form of this matrix is

\[
\left[ \begin{array}{rrrr|r}
1 & 0 & 0 & -\frac{1}{2} & 0 \\
0 & 1 & 0 & -1 & 0 \\
0 & 0 & 1 & -\frac{1}{2} & 0
\end{array} \right]
\]

The solution is given by

\[
\begin{array}{l}
x - \frac{1}{2} w = 0 \\
y - w = 0 \\
z - \frac{1}{2} w = 0
\end{array}
\]

which we can write as

\[
\begin{array}{l}
x = \frac{1}{2} t \\
y = t \\
z = \frac{1}{2} t \\
w = t
\end{array}
\]

For example, let w = 2 and this would yield x = 1, y = 2, and z = 1. We can put these values back into
the expression for the reaction which yields

SnO2 + 2H2 → Sn + 2H2O

Observe  that  each  side  of  the  expression  contains  the  same  number  of  atoms  of  each  element.  This  means 
that  it  preserves  the  total  number  of  atoms,  as  required,  and  so  the  chemical  reaction  is  balanced. 
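
The procedure above can be sketched in code. This is only an illustration under stated assumptions: the helper names rref and balance are hypothetical, the system is assumed to have exactly one free variable sitting in the last column (as it does here), and the free variable is chosen so that all coefficients are the smallest positive integers.

```python
from fractions import Fraction
from math import gcd

def rref(matrix):
    """Return (reduced row-echelon form, list of pivot column indices)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots, r = [], 0
    for c in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    return rows, pivots

def balance(coefficients):
    """Smallest positive integer solution of a homogeneous system.

    Assumes exactly one free variable, located in the last column.
    """
    R, pivots = rref(coefficients)
    n = len(coefficients[0])
    if pivots != list(range(n - 1)):
        raise ValueError("expected exactly one free variable in the last column")
    # With the last variable equal to t = 1, each pivot variable is -R[i][n-1].
    solution = [-R[i][n - 1] for i in range(n - 1)] + [Fraction(1)]
    # Clear denominators by multiplying by their least common multiple.
    scale = 1
    for s in solution:
        scale = scale * s.denominator // gcd(scale, s.denominator)
    return [int(s * scale) for s in solution]

# Columns are x, y, z, w; rows are the Sn, O and H equations from the text.
tin_reaction = [[1, 0, -1, 0],
                [2, 0, 0, -1],
                [0, 2, 0, -2]]

print(balance(tin_reaction))  # coefficients x, y, z, w
```

For the tin reaction this reproduces the choice w = 2 made above, since t = 2 is the smallest value of the parameter that makes every coefficient a whole number.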

Consider  another  example. 


Example  1.37:  Balancing  a  Chemical  Reaction 


Potassium is denoted by K, oxygen by O, phosphorus by P and hydrogen by H. Consider the
reaction given by

KOH + H3PO4 → K3PO4 + H2O

Balance this chemical reaction.


Solution. We will use the same procedure as above to solve this problem. We need to find values for
x, y, z, w such that

x KOH + y H3PO4 → z K3PO4 + w H2O

preserves the total number of atoms of each element.

Finding these values can be done by finding the solution to the following system of equations.


K: x = 3z
O: x + 4y = 4z + w
H: x + 3y = 2w
P: y = z


The augmented matrix for this system is

\[
\left[ \begin{array}{rrrr|r}
1 & 0 & -3 & 0 & 0 \\
1 & 4 & -4 & -1 & 0 \\
1 & 3 & 0 & -2 & 0 \\
0 & 1 & -1 & 0 & 0
\end{array} \right]
\]

and the reduced row-echelon form is

\[
\left[ \begin{array}{rrrr|r}
1 & 0 & 0 & -1 & 0 \\
0 & 1 & 0 & -\frac{1}{3} & 0 \\
0 & 0 & 1 & -\frac{1}{3} & 0 \\
0 & 0 & 0 & 0 & 0
\end{array} \right]
\]

The solution is given by

\[
\begin{array}{l}
x - w = 0 \\
y - \frac{1}{3} w = 0 \\
z - \frac{1}{3} w = 0
\end{array}
\]

which can be written as

\[
\begin{array}{l}
x = t \\
y = \frac{1}{3} t \\
z = \frac{1}{3} t \\
w = t
\end{array}
\]

Choose a value for t, say 3. Then w = 3 and this yields x = 3, y = 1, z = 1. It follows that the balanced
reaction is given by

3KOH + 1H3PO4 → 1K3PO4 + 3H2O

Note that this results in the same number of atoms on both sides.

*

Of course these numbers you are finding would typically be the number of moles of the molecules on
each side. Thus three moles of KOH added to one mole of H3PO4 yields one mole of K3PO4 and three
moles of H2O.

1.2.6. Dimensionless Variables


This  section  shows  how  solving  systems  of  equations  can  be  used  to  determine  appropriate  dimensionless 
variables.  It  is  only  an  introduction  to  this  topic  and  considers  a  specific  example  of  a  simple  airplane 
wing  shown  below.  We  assume  for  simplicity  that  it  is  a  flat  plane  at  an  angle  to  the  wind  which  is  blowing 
against  it  with  speed  V  as  shown. 


The angle θ is called the angle of incidence, B is the span of the wing and A is called the chord.
Denote by l the lift. Then this should depend on various quantities like θ, V, B, A and so forth. Here is a
table which indicates various quantities on which it is reasonable to expect l to depend.


Variable          Symbol   Units
chord             A        m
span              B        m
angle incidence   θ        m^0 kg^0 sec^0
speed of wind     V        m sec^-1
speed of sound    V_0      m sec^-1
density of air    ρ        kg m^-3
viscosity         μ        kg sec^-1 m^-1
lift              l        kg sec^-2 m

Here m denotes meters, sec refers to seconds and kg refers to kilograms. All of these are likely familiar
except for μ, which we will discuss in further detail now.

Viscosity  is  a  measure  of  how  much  internal  friction  is  experienced  when  the  fluid  moves.  It  is  roughly 
a  measure  of  how  “sticky"  the  fluid  is.  Consider  a  piece  of  area  parallel  to  the  direction  of  motion  of  the 
fluid.  To  say  that  the  viscosity  is  large  is  to  say  that  the  tangential  force  applied  to  this  area  must  be  large 
in  order  to  achieve  a  given  change  in  speed  of  the  fluid  in  a  direction  normal  to  the  tangential  force.  Thus 

\[
\mu \, (\text{area}) \, (\text{velocity gradient}) = \text{tangential force}
\]

Hence

\[
(\text{units on } \mu) \; m^2 \left( \frac{m}{sec \; m} \right) = kg \; sec^{-2} \; m
\]

Thus the units on μ are

\[
kg \; sec^{-1} \; m^{-1}
\]

as  claimed  above. 

Returning  to  our  original  discussion,  you  may  think  that  we  would  want 

\[
l = f(A, B, \theta, V, V_0, \rho, \mu)
\]

This is very cumbersome because it depends on seven variables. Also, it is likely that without much care,
a change in the units such as going from meters to feet would result in an incorrect value for l. The way to
get around this problem is to look for l as a function of dimensionless variables multiplied by something
which has units of force. It is helpful because first of all, you will likely have fewer independent variables
and secondly, you could expect the formula to hold independent of the way of specifying length, mass and
so forth. One looks for

\[
l = f(g_1, \cdots, g_k) \, \rho V^2 AB
\]

where the units on ρV²AB are

\[
\frac{kg}{m^3} \left( \frac{m}{sec} \right)^2 m^2 = \frac{kg \times m}{sec^2}
\]

which are the units of force. Each of these g_i is of the form

\[
A^{x_1} B^{x_2} \theta^{x_3} V^{x_4} V_0^{x_5} \rho^{x_6} \mu^{x_7} \tag{1.11}
\]

and each g_i is independent of the dimensions. That is, this expression must not depend on meters,
kilograms, seconds, etc. Thus, placing in the units for each of these quantities, one needs

\[
m^{x_1} m^{x_2} \left( m^{x_4} sec^{-x_4} \right) \left( m^{x_5} sec^{-x_5} \right) \left( kg \, m^{-3} \right)^{x_6} \left( kg \, sec^{-1} m^{-1} \right)^{x_7} = m^0 kg^0 sec^0
\]

Notice that there are no units on θ because it is just the radian measure of an angle. Hence its dimensions
consist of length divided by length, thus it is dimensionless. Then this leads to the following equations for
the x_i.

\[
\begin{array}{rl}
m: & x_1 + x_2 + x_4 + x_5 - 3x_6 - x_7 = 0 \\
sec: & -x_4 - x_5 - x_7 = 0 \\
kg: & x_6 + x_7 = 0
\end{array}
\]

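The augmented matrix for this system is

\[
\left[ \begin{array}{rrrrrrr|r}
1 & 1 & 0 & 1 & 1 & -3 & -1 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 0
\end{array} \right]
\]

The reduced row-echelon form is given by

\[
\left[ \begin{array}{rrrrrrr|r}
1 & 1 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 0
\end{array} \right]
\]

and so the solutions are of the form

\[
\begin{array}{l}
x_1 = -x_2 - x_7 \\
x_3 = x_3 \\
x_4 = -x_5 - x_7 \\
x_6 = -x_7
\end{array}
\]

Thus, in terms of vectors, the solution is

\[
\left[ \begin{array}{c}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7
\end{array} \right]
=
\left[ \begin{array}{c}
-x_2 - x_7 \\ x_2 \\ x_3 \\ -x_5 - x_7 \\ x_5 \\ -x_7 \\ x_7
\end{array} \right]
\]
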

Thus the free variables are x_2, x_3, x_5, x_7. By assigning values to these, we can obtain dimensionless variables
by placing the values obtained for the x_i in the formula 1.11. For example, let x_2 = 1 and all the rest of the
free variables are 0. This yields

\[
x_1 = -1, \; x_2 = 1, \; x_3 = 0, \; x_4 = 0, \; x_5 = 0, \; x_6 = 0, \; x_7 = 0
\]

The dimensionless variable is then A^{-1}B^{1}. This is the ratio between the span and the chord. It is called
the aspect ratio, denoted as AR. Next let x_3 = 1 and all others equal zero. This gives for a dimensionless
quantity the angle θ. Next let x_5 = 1 and all others equal zero. This gives

\[
x_1 = 0, \; x_2 = 0, \; x_3 = 0, \; x_4 = -1, \; x_5 = 1, \; x_6 = 0, \; x_7 = 0
\]

Then the dimensionless variable is V^{-1}V_0. However, it is written as V/V_0. This is called the Mach number
M. Finally, let x_7 = 1 and all the other free variables equal 0. Then

\[
x_1 = -1, \; x_2 = 0, \; x_3 = 0, \; x_4 = -1, \; x_5 = 0, \; x_6 = -1, \; x_7 = 1
\]

then the dimensionless variable which results from this is A^{-1}V^{-1}ρ^{-1}μ. It is customary to write it as
Re = (AVρ)/μ. This one is called the Reynolds number. It is the one which involves viscosity. Thus we
would look for

\[
l = f(Re, AR, \theta, M) \; kg \times m / sec^2
\]

This is quite interesting because it is easy to vary Re by simply adjusting the velocity or A but it is hard to
vary things like μ or ρ. Note that all the quantities are easy to adjust. Now this could be used, along with
wind  tunnel  experiments  to  get  a  formula  for  the  lift  which  would  be  reasonable.  You  could  also  consider 
more  variables  and  more  complicated  situations  in  the  same  way. 
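
One way to double check the exponent choices above is to add up the unit exponents directly. The sketch below is an illustration only: the dictionary UNITS encodes the table of units given earlier as (m, sec, kg) exponent triples, and the names units_of, aspect_ratio, mach, reynolds_inverse and force are assumptions introduced for this example.

```python
# Unit exponents (m, sec, kg) for each variable, copied from the table of units.
UNITS = {
    "A":     (1, 0, 0),    # chord: m
    "B":     (1, 0, 0),    # span: m
    "theta": (0, 0, 0),    # angle of incidence: dimensionless
    "V":     (1, -1, 0),   # speed of wind: m sec^-1
    "V0":    (1, -1, 0),   # speed of sound: m sec^-1
    "rho":   (-3, 0, 1),   # density of air: kg m^-3
    "mu":    (-1, -1, 1),  # viscosity: kg sec^-1 m^-1
}
ORDER = ["A", "B", "theta", "V", "V0", "rho", "mu"]

def units_of(exponents):
    """Units (m, sec, kg) of A^x1 B^x2 theta^x3 V^x4 V0^x5 rho^x6 mu^x7."""
    total = [0, 0, 0]
    for name, x in zip(ORDER, exponents):
        for k in range(3):
            total[k] += x * UNITS[name][k]
    return tuple(total)

# Exponent vectors (x1, ..., x7) obtained in the text.
aspect_ratio = (-1, 1, 0, 0, 0, 0, 0)        # A^-1 B
mach = (0, 0, 0, -1, 1, 0, 0)                # V^-1 V0
reynolds_inverse = (-1, 0, 0, -1, 0, -1, 1)  # A^-1 V^-1 rho^-1 mu
force = (1, 1, 0, 2, 0, 1, 0)                # rho V^2 A B

for label, x in [("aspect ratio", aspect_ratio), ("Mach", mach),
                 ("Reynolds (inverse)", reynolds_inverse), ("rho V^2 A B", force)]:
    print(label, units_of(x))
```

A result of (0, 0, 0) means the combination is dimensionless, while (1, -2, 1) corresponds to kg sec^-2 m, the units of force.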




1.2.7.  An  Application  to  Resistor  Networks 


The  tools  of  linear  algebra  can  be  used  to  study  the  application  of  resistor  networks.  An  example  of  an 
electrical  circuit  is  below. 


[Figure: example circuit with resistors, a voltage source, and the loop current I_1]


The jagged lines denote resistors and the numbers next to them give their resistance in
ohms, written as Ω. The voltage source causes the current to flow in the direction from the shorter
of the two lines toward the longer (as indicated by the arrow). The current for a circuit is labelled I_k.

In the above figure, the current I_1 has been labelled with an arrow in the counter clockwise direction.
This  is  an  entirely  arbitrary  decision  and  we  could  have  chosen  to  label  the  current  in  the  clockwise 
direction.  With  our  choice  of  direction  here,  we  define  a  positive  current  to  flow  in  the  counter  clockwise 
direction  and  a  negative  current  to  flow  in  the  clockwise  direction. 

The  goal  of  this  section  is  to  use  the  values  of  resistors  and  voltage  sources  in  a  circuit  to  determine 
the current. An essential theorem for this application is Kirchhoff’s law.


Theorem  1.38:  Kirchhoff’s  Law 


The sum of the resistance (R) times the amps (I) in the counter clockwise direction around a loop
equals  the  sum  of  the  voltage  sources  (V)  in  the  same  direction  around  the  loop. 


Kirchhoff’s  law  allows  us  to  set  up  a  system  of  linear  equations  and  solve  for  any  unknown  variables. 
When  setting  up  this  system,  it  is  important  to  trace  the  circuit  in  the  counter  clockwise  direction.  If  a 
resistor or voltage source is crossed against this direction, the related term must be given a negative sign.

We will explore this in the next example where we determine the value of the current in the initial
diagram.

Solution. Begin in the bottom left corner, and trace the circuit in the counter clockwise direction. At the
first resistor, multiplying resistance and current gives 2I_1. Continuing in this way through all three resistors
gives 2I_1 + 4I_1 + 2I_1. This must equal the voltage source in the same direction. Notice that the direction
of the voltage source matches the counter clockwise direction specified, so the voltage is positive.

Therefore  the  equation  and  solution  are  given  by 


\[
\begin{array}{rcl}
2I_1 + 4I_1 + 2I_1 & = & 18 \\
8I_1 & = & 18 \\
I_1 & = & \frac{9}{4} \text{ A}
\end{array}
\]


Since  the  answer  is  positive,  this  confirms  that  the  current  flows  counter  clockwise. 


* 


Solution. Begin in the top left corner this time, and trace the circuit in the counter clockwise direction.
At the first resistor, multiplying resistance and current gives 4I_1. Continuing in this way through the four
resistors gives 4I_1 + 6I_1 + 1I_1 + 3I_1. This must equal the voltage source in the same direction. Notice that
the direction of the voltage source is opposite to the counter clockwise direction, so the voltage is negative.

Therefore the equation and solution are given by

\[
\begin{array}{rcl}
4I_1 + 6I_1 + 1I_1 + 3I_1 & = & -27 \\
14I_1 & = & -27 \\
I_1 & = & -\frac{27}{14} \text{ A}
\end{array}
\]


Since  the  answer  is  negative,  this  tells  us  that  the  current  flows  clockwise. 


* 


A  more  complicated  example  follows.  Two  of  the  circuits  below  may  be  familiar;  they  were  examined 
in  the  examples  above.  However  as  they  are  now  part  of  a  larger  system  of  circuits,  the  answers  will  differ. 


Example  1.41:  Unknown  Currents 


The diagram below consists of four circuits. The current (I_k) in the four circuits is denoted by
I_1, I_2, I_3, I_4. Using Kirchhoff’s Law, write an equation for each circuit and solve for each current.


Solution.  The  circuits  are  given  in  the  following  diagram. 


Starting with the top left circuit, multiply the resistance by the amps and sum the resulting products.
Specifically, consider the resistor labelled 2Ω that is part of the circuits of I_1 and I_2. Notice that current I_2
runs through this in a positive (counter clockwise) direction, and I_1 runs through in the opposite (negative)
direction. The product of resistance and amps is then 2(I_2 − I_1) = 2I_2 − 2I_1. Continue in this way for each
resistor, and set the sum of the products equal to the voltage source to write the equation:

\[
2I_2 - 2I_1 + 4I_2 - 4I_3 + 2I_2 = 18
\]
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The  above  process  is  used  on  each  of  the  other  three  circuits,  and  the  resulting  equations  are: 

Upper right circuit:

\[
4I_3 - 4I_2 + 6I_3 - 6I_4 + I_3 + 3I_3 = -27
\]

Lower right circuit:

\[
3I_4 + 2I_4 + 6I_4 - 6I_3 + I_4 - I_1 = 0
\]

Lower left circuit:

\[
5I_1 + I_1 - I_4 + 2I_1 - 2I_2 = -23
\]

Notice that the voltage for the upper right and lower left circuits are negative due to the clockwise
direction they indicate.

The resulting system of four equations in four unknowns is

\[
\begin{array}{rcl}
2I_2 - 2I_1 + 4I_2 - 4I_3 + 2I_2 & = & 18 \\
4I_3 - 4I_2 + 6I_3 - 6I_4 + I_3 + 3I_3 & = & -27 \\
2I_4 + 3I_4 + 6I_4 - 6I_3 + I_4 - I_1 & = & 0 \\
5I_1 + I_1 - I_4 + 2I_1 - 2I_2 & = & -23
\end{array}
\]

Simplifying and rearranging with variables in order, we have:

\[
\begin{array}{rcl}
-2I_1 + 8I_2 - 4I_3 & = & 18 \\
-4I_2 + 14I_3 - 6I_4 & = & -27 \\
-I_1 - 6I_3 + 12I_4 & = & 0 \\
8I_1 - 2I_2 - I_4 & = & -23
\end{array}
\]

The augmented matrix is

\[
\left[ \begin{array}{rrrr|r}
-2 & 8 & -4 & 0 & 18 \\
0 & -4 & 14 & -6 & -27 \\
-1 & 0 & -6 & 12 & 0 \\
8 & -2 & 0 & -1 & -23
\end{array} \right]
\]

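The solution to this system is

\[
\begin{array}{l}
I_1 = -3 \text{ A} \\
I_2 = \frac{1}{4} \text{ A} \\
I_3 = -\frac{5}{2} \text{ A} \\
I_4 = -\frac{3}{2} \text{ A}
\end{array}
\]

This tells us that currents I_1, I_3, and I_4 travel clockwise while I_2 travels counter clockwise.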


* 
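
The four-loop system above is small enough to solve by hand, but it can also be checked numerically. The following is a minimal sketch, assuming the system has a unique solution; the function name solve is hypothetical, and the rows of the matrix are the simplified loop equations written in the order I_1, I_2, I_3, I_4.

```python
from fractions import Fraction

def solve(augmented):
    """Gauss-Jordan elimination on a square system with a unique solution.

    Raises StopIteration if some column has no usable pivot, which
    happens when the system does not have a unique solution.
    """
    rows = [[Fraction(x) for x in row] for row in augmented]
    n = len(rows)
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        p = next(i for i in range(c, n) if rows[i][c] != 0)
        rows[c], rows[p] = rows[p], rows[c]
        pivot = rows[c][c]
        rows[c] = [x / pivot for x in rows[c]]
        for i in range(n):
            if i != c and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[c])]
    return [row[-1] for row in rows]

# Augmented matrix of the simplified loop equations.
circuit = [[-2, 8, -4, 0, 18],
           [0, -4, 14, -6, -27],
           [-1, 0, -6, 12, 0],
           [8, -2, 0, -1, -23]]

currents = solve(circuit)
for k, value in enumerate(currents, start=1):
    direction = "counter clockwise" if value > 0 else "clockwise"
    print(f"I_{k} = {value} A ({direction})")
```

The sign of each computed current gives its direction under the convention used in this section: a negative value means the current flows clockwise.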
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Exercises 


Exercise 1.2.1 Find the point (x_1, y_1) which lies on both lines, x + 3y = 1 and 4x − y = 3.

Exercise 1.2.2 Find the point of intersection of the two lines 3x + y = 3 and x + 2y = 1.

Exercise 1.2.3 Do the three lines, x + 2y = 1, 2x − y = 1, and 4x + 3y = 3 have a common point of
intersection? If so, find the point and if not, tell why they don’t have such a common point of intersection.

Exercise 1.2.4 Do the three planes, x + y − 3z = 2, 2x + y + z = 1, and 3x + 2y − 2z = 0 have a common
point of intersection? If so, find one and if not, tell why there is no such point.

Exercise 1.2.5 Four times the weight of Gaston is 150 pounds more than the weight of Ichabod. Four
times the weight of Ichabod is 660 pounds less than seventeen times the weight of Gaston. Four times the
weight of Gaston plus the weight of Siegfried equals 290 pounds. Brunhilde would balance all three of the
others. Find the weights of the four people.

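Exercise 1.2.6 Consider the following augmented matrix in which * denotes an arbitrary number and ■
denotes a nonzero number. Determine whether the given augmented matrix is consistent. If consistent, is
the solution unique?

\[
\left[ \begin{array}{ccccc|c}
\blacksquare & \ast & \ast & \ast & \ast & \ast \\
0 & \blacksquare & \ast & \ast & 0 & \ast \\
0 & 0 & \blacksquare & \ast & \ast & \ast \\
0 & 0 & 0 & 0 & \blacksquare & \ast
\end{array} \right]
\]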

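Exercise 1.2.7 Consider the following augmented matrix in which * denotes an arbitrary number and ■
denotes a nonzero number. Determine whether the given augmented matrix is consistent. If consistent, is
the solution unique?

\[
\left[ \begin{array}{ccc|c}
\blacksquare & \ast & \ast & \ast \\
0 & \blacksquare & \ast & \ast \\
0 & 0 & \blacksquare & \ast
\end{array} \right]
\]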

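Exercise 1.2.8 Consider the following augmented matrix in which * denotes an arbitrary number and ■
denotes a nonzero number. Determine whether the given augmented matrix is consistent. If consistent, is
the solution unique?

\[
\left[ \begin{array}{ccccc|c}
\blacksquare & \ast & \ast & \ast & \ast & \ast \\
0 & \blacksquare & 0 & \ast & 0 & \ast \\
0 & 0 & 0 & \blacksquare & \ast & \ast \\
0 & 0 & 0 & 0 & \blacksquare & \ast
\end{array} \right]
\]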

Exercise 1.2.9 Consider the following augmented matrix in which * denotes an arbitrary number and ■
denotes a nonzero number. Determine whether the given augmented matrix is consistent. If consistent, is
the solution unique?

\[
\left[ \begin{array}{ccccc|c}
\blacksquare & \ast & \ast & \ast & \ast & \ast \\
0 & \blacksquare & \ast & \ast & 0 & \ast \\
0 & 0 & 0 & 0 & \blacksquare & 0 \\
0 & 0 & 0 & 0 & \ast & \blacksquare
\end{array} \right]
\]

Exercise  1.2.10  Suppose  a  system  of  equations  has  fewer  equations  than  variables.  Will  such  a  system 
necessarily  be  consistent?  If  so,  explain  why  and  if  not,  give  an  example  which  is  not  consistent. 

Exercise  1.2.11  If  a  system  of  equations  has  more  equations  than  variables,  can  it  have  a  solution?  If  so, 
give  an  example  and  if  not,  tell  why  not. 

Exercise 1.2.12 Find h such that

\[
\left[ \begin{array}{rr|r}
2 & h & 4 \\
3 & 6 & 7
\end{array} \right]
\]

is the augmented matrix of an inconsistent system.

Exercise 1.2.13 Find h such that

\[
\left[ \begin{array}{rr|r}
1 & h & 3 \\
2 & 4 & 6
\end{array} \right]
\]

is the augmented matrix of a consistent system.

Exercise 1.2.14 Find h such that

\[
\left[ \begin{array}{rr|r}
1 & 1 & 4 \\
3 & h & 12
\end{array} \right]
\]

is the augmented matrix of a consistent system.


Exercise 1.2.15 Choose h and k such that the augmented matrix shown has each of the following:

(a) one solution

(b) no solution

(c) infinitely many solutions

\[
\left[ \begin{array}{rr|r}
1 & h & 2 \\
2 & 4 & k
\end{array} \right]
\]

Exercise 1.2.16 Choose h and k such that the augmented matrix shown has each of the following:

(a) one solution

(b) no solution

(c) infinitely many solutions

\[
\left[ \begin{array}{rr|r}
1 & 2 & 2 \\
2 & h & k
\end{array} \right]
\]

Exercise  1.2.17  Determine  if  the  system  is  consistent.  If  so,  is  the  solution  unique? 

\[
\begin{array}{l}
x + 2y + z - w = 2 \\
x - y + z + w = 1 \\
2x + y - z = 1 \\
4x + 2y + z = 5
\end{array}
\]


Exercise  1.2.18  Determine  if  the  system  is  consistent.  If  so,  is  the  solution  unique? 

\[
\begin{array}{l}
x + 2y + z - w = 2 \\
x - y + z + w = 0 \\
2x + y - z = 1 \\
4x + 2y + z = 3
\end{array}
\]


Exercise  1.2.19  Determine  which  matrices  are  in  reduced  row-echelon  form. 


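(a)
\[
\left[ \begin{array}{rrr}
1 & 2 & 0 \\
0 & 1 & 7
\end{array} \right]
\]

(b)
\[
\left[ \begin{array}{rrrr}
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 2 \\
0 & 0 & 0 & 0
\end{array} \right]
\]

(c)
\[
\left[ \begin{array}{rrrrrr}
1 & 1 & 0 & 0 & 0 & 5 \\
0 & 0 & 1 & 2 & 0 & 4 \\
0 & 0 & 0 & 0 & 1 & 3
\end{array} \right]
\]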


Exercise 1.2.20 Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain
the reduced row-echelon form.

\[
\left[ \begin{array}{rrrr}
2 & -1 & 3 & -1 \\
1 & 0 & 2 & 1 \\
1 & -1 & 1 & -2
\end{array} \right]
\]

Exercise 1.2.21 Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain
the reduced row-echelon form.

\[
\left[ \begin{array}{rrrr}
0 & 0 & -1 & -1 \\
1 & 1 & 1 & 0 \\
1 & 1 & 0 & -1
\end{array} \right]
\]

Exercise 1.2.22 Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain
the reduced row-echelon form.

\[
\left[ \begin{array}{rrrr}
3 & -6 & -7 & -8 \\
1 & -2 & -2 & -2 \\
1 & -2 & -3 & -4
\end{array} \right]
\]

Exercise 1.2.23 Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain
the reduced row-echelon form.

\[
\left[ \begin{array}{rrrr}
2 & 4 & 5 & 15 \\
1 & 2 & 3 & 9 \\
1 & 2 & 2 & 6
\end{array} \right]
\]

Exercise 1.2.24 Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain
the reduced row-echelon form.

\[
\left[ \begin{array}{rrrr}
4 & -1 & 7 & 10 \\
1 & 0 & 3 & 3 \\
1 & -1 & -2 & 1
\end{array} \right]
\]

Exercise 1.2.25 Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain
the reduced row-echelon form.

\[
\left[ \begin{array}{rrrr}
3 & 5 & -4 & 2 \\
1 & 2 & -1 & 1 \\
1 & 1 & -2 & 0
\end{array} \right]
\]

Exercise 1.2.26 Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain
the reduced row-echelon form.

\[
\left[ \begin{array}{rrrr}
-2 & 3 & -8 & 7 \\
1 & -2 & 5 & -5 \\
1 & -3 & 7 & -8
\end{array} \right]
\]


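Exercise 1.2.27 Find the solution of the system whose augmented matrix is

\[
\left[ \begin{array}{rrr|r}
1 & 2 & 0 & 2 \\
1 & 3 & 4 & 2 \\
1 & 0 & 2 & 1
\end{array} \right]
\]

Exercise 1.2.28 Find the solution of the system whose augmented matrix is

\[
\left[ \begin{array}{rrr|r}
1 & 2 & 0 & 2 \\
2 & 0 & 1 & 1 \\
3 & 2 & 1 & 3
\end{array} \right]
\]

Exercise 1.2.29 Find the solution of the system whose augmented matrix is

\[
\left[ \begin{array}{rrr|r}
1 & 1 & 0 & 1 \\
1 & 0 & 4 & 2
\end{array} \right]
\]

Exercise 1.2.30 Find the solution of the system whose augmented matrix is

\[
\left[ \begin{array}{rrrrr|r}
1 & 0 & 2 & 1 & 1 & 2 \\
0 & 1 & 0 & 1 & 2 & 1 \\
1 & 2 & 0 & 0 & 1 & 3 \\
1 & 0 & 1 & 0 & 2 & 2
\end{array} \right]
\]

Exercise 1.2.31 Find the solution of the system whose augmented matrix is

\[
\left[ \begin{array}{rrrrr|r}
1 & 0 & 2 & 1 & 1 & 2 \\
0 & 1 & 0 & 1 & 2 & 1 \\
0 & 2 & 0 & 0 & 1 & 3 \\
1 & -1 & 2 & 2 & 2 & 0
\end{array} \right]
\]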

Exercise 1.2.32 Find the solution to the system of equations, 7x + 14y + 15z = 22, 2x + 4y + 3z = 5, and
3x + 6y + 10z = 13.

Exercise 1.2.33 Find the solution to the system of equations, 3x − y + 4z = 6, y + 8z = 0, and −2x + y =
−4.

Exercise 1.2.34 Find the solution to the system of equations, 9x − 2y + 4z = −17, 13x − 3y + 6z = −25,
and −2x − z = 3.

Exercise 1.2.35 Find the solution to the system of equations, 65x + 84y + 16z = 546, 81x + 105y + 20z =
682, and 84x + 110y + 21z = 713.

Exercise 1.2.36 Find the solution to the system of equations, 8x + 2y + 3z = −3, 8x + 3y + 3z = −1, and
4x + y + 3z = −9.

Exercise 1.2.37 Find the solution to the system of equations, −8x + 2y + 5z = 18, −8x + 3y + 5z = 13,
and −4x + y + 5z = 19.

Exercise 1.2.38 Find the solution to the system of equations, 3x − y − 2z = 3, y − 4z = 0, and −2x + y =
−2.

Exercise 1.2.39 Find the solution to the system of equations, −9x + 15y = 66, −11x + 18y = 79, −x + y =
4, and z = 3.

Exercise 1.2.40 Find the solution to the system of equations, −19x + 8y = −108, −71x + 30y = −404,
−2x + y = −12, 4x + z = 14.

Exercise  1.2.41  Suppose  a  system  of  equations  has  fewer  equations  than  variables  and  you  have  found  a 
solution  to  this  system  of  equations.  Is  it  possible  that  your  solution  is  the  only  one?  Explain. 

Exercise  1.2.42  Suppose  a  system  of  linear  equations  has  a  2x4  augmented  matrix  and  the  last  column 
is  a  pivot  column.  Could  the  system  of  linear  equations  be  consistent?  Explain. 

Exercise 1.2.43 Suppose the coefficient matrix of a system of n equations with n variables has the property
that every column is a pivot column. Does it follow that the system of equations must have a solution? If
so, must the solution be unique? Explain.

Exercise  1.2.44  Suppose  there  is  a  unique  solution  to  a  system  of  linear  equations.  What  must  be  true  of 
the  pivot  columns  in  the  augmented  matrix? 

Exercise 1.2.45 The steady state temperature, u, of a plate solves Laplace’s equation, Δu = 0. One way
to approximate the solution is to divide the plate into a square mesh and require the temperature at each
node to equal the average of the temperature at the four adjacent nodes. In the following picture, the
numbers represent the observed temperature at the indicated nodes. Find the temperature at the interior
nodes, indicated by x, y, z, and w. One of the equations is z = (1/4)(10 + 0 + w + x).

[Figure: square mesh with observed boundary temperatures and interior nodes x, y, z, w]

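Exercise 1.2.46 Find the rank of the following matrix.

\[
\left[ \begin{array}{rrrr}
4 & -16 & -1 & -5 \\
1 & -4 & 0 & -1 \\
1 & -4 & -1 & -2
\end{array} \right]
\]

Exercise 1.2.47 Find the rank of the following matrix.

\[
\left[ \begin{array}{rrrr}
3 & 6 & 5 & 12 \\
1 & 2 & 2 & 5 \\
1 & 2 & 1 & 2
\end{array} \right]
\]

Exercise 1.2.48 Find the rank of the following matrix.

\[
\left[ \begin{array}{rrrrr}
0 & 0 & -1 & 0 & 3 \\
1 & 4 & 1 & 0 & -8 \\
1 & 4 & 0 & 1 & 2 \\
-1 & -4 & 0 & -1 & -2
\end{array} \right]
\]

Exercise 1.2.49 Find the rank of the following matrix.

\[
\left[ \begin{array}{rrrr}
4 & -4 & 3 & -9 \\
1 & -1 & 1 & -2 \\
1 & -1 & 0 & -3
\end{array} \right]
\]

Exercise 1.2.50 Find the rank of the following matrix.

\[
\left[ \begin{array}{rrrrr}
2 & 0 & 1 & 0 & 1 \\
1 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 7 \\
1 & 0 & 0 & 1 & 7
\end{array} \right]
\]

Exercise 1.2.51 Find the rank of the following matrix.

\[
\left[ \begin{array}{rrr}
4 & 15 & 29 \\
1 & 4 & 8 \\
1 & 3 & 5 \\
3 & 9 & 15
\end{array} \right]
\]

Exercise 1.2.52 Find the rank of the following matrix.

\[
\left[ \begin{array}{rrrrr}
0 & 0 & -1 & 0 & 1 \\
1 & 2 & 3 & -2 & -18 \\
1 & 2 & 2 & -1 & -11 \\
-1 & -2 & -2 & 1 & 11
\end{array} \right]
\]

Exercise 1.2.53 Find the rank of the following matrix.

\[
\left[ \begin{array}{rrrrr}
1 & -2 & 0 & 3 & 11 \\
1 & -2 & 0 & 4 & 15 \\
1 & -2 & 0 & 3 & 11 \\
0 & 0 & 0 & 0 & 0
\end{array} \right]
\]

Exercise 1.2.54 Find the rank of the following matrix.

\[
\left[ \begin{array}{rrr}
-2 & -3 & -2 \\
1 & 1 & 1 \\
1 & 0 & 1 \\
-3 & 0 & -3
\end{array} \right]
\]

Exercise 1.2.55 Find the rank of the following matrix.

\[
\left[ \begin{array}{rrrrr}
4 & 4 & 20 & -1 & 17 \\
1 & 1 & 5 & 0 & 5 \\
1 & 1 & 5 & -1 & 2 \\
3 & 3 & 15 & -3 & 6
\end{array} \right]
\]

Exercise 1.2.56 Find the rank of the following matrix.

\[
\left[ \begin{array}{rrrrr}
-1 & 3 & 4 & -3 & 8 \\
1 & -3 & -4 & 2 & -5 \\
1 & -3 & -4 & 1 & -2 \\
-2 & 6 & 8 & -2 & 4
\end{array} \right]
\]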

Exercise 1.2.57 Suppose A is an m × n matrix. Explain why the rank of A is always no larger than
min(m, n).

Exercise 1.2.58 State whether each of the following sets of data are possible for the matrix equation
AX = B. If possible, describe the solution set. That is, tell whether there exists a unique solution, no
solution or infinitely many solutions. Here, [A|B] denotes the augmented matrix.

(a) A is a 5 × 6 matrix, rank(A) = 4 and rank[A|B] = 4.

(b) A is a 3 × 4 matrix, rank(A) = 3 and rank[A|B] = 2.

(c) A is a 4 × 2 matrix, rank(A) = 4 and rank[A|B] = 4.

(d) A is a 5 × 5 matrix, rank(A) = 4 and rank[A|B] = 5.

(e) A is a 4 × 2 matrix, rank(A) = 2 and rank[A|B] = 2.

Exercise 1.2.59 Consider the system −5x + 2y − z = 0 and −5x − 2y − z = 0. Both equations equal zero
and so −5x + 2y − z = −5x − 2y − z which is equivalent to y = 0. Does it follow that x and z can equal
anything? Notice that when x = 1, z = −4, and y = 0 are plugged in to the equations, the equations do
not equal 0. Why?

Exercise  1.2.60  Balance  the  following  chemical  reactions. 

(a) KNO3 + H2CO3 → K2CO3 + HNO3

(b) AgI + Na2S → Ag2S + NaI

(c) Ba3N2 + H2O → Ba(OH)2 + NH3

(d) CaCl2 + Na3PO4 → Ca3(PO4)2 + NaCl

Exercise 1.2.61 In the section on dimensionless variables it was observed that ρV²AB has the units of
force. Describe a systematic way to obtain such combinations of the variables which will yield something
which has the units of force.


Exercise 1.2.62 Consider the following diagram of four circuits.

[Figure: diagram of four circuits]

The current in amps in the four circuits is denoted by I_1, I_2, I_3, I_4 and it is understood that the motion is
in the counter clockwise direction. If I_k ends up being negative, then it just means the current flows in the
clockwise direction.

In the above diagram, the top left circuit should give the equation

\[
2I_2 - 2I_1 + 5I_2 - 5I_3 + 3I_2 = 5
\]

For the circuit on the lower left, you should have

\[
4I_1 + I_1 - I_4 + 2I_1 - 2I_2 = -10
\]

Write  equations  for  each  of  the  other  two  circuits  and  then  give  a  solution  to  the  resulting  system  of 
equations. 

Exercise 1.2.63 Consider the following diagram of three circuits.

The current in amps in the three circuits is denoted by $I_1, I_2, I_3$ and it is understood that the motion is
in the counter clockwise direction. If $I_k$ ends up being negative, then it just means the current flows in the
clockwise direction.

Find 


2.  Matrices 


2.1  Matrix  Arithmetic 


Outcomes 


A. Perform the matrix operations of matrix addition, scalar multiplication, transposition and
matrix multiplication. Identify when these operations are not defined. Represent these operations
in terms of the entries of a matrix.

B. Prove algebraic properties for matrix addition, scalar multiplication, transposition, and
matrix multiplication. Apply these properties to manipulate an algebraic expression involving
matrices.

C.  Compute  the  inverse  of  a  matrix  using  row  operations,  and  prove  identities  involving  matrix 
inverses. 

E.  Solve  a  linear  system  using  matrix  algebra. 

F.  Use  multiplication  by  an  elementary  matrix  to  apply  row  operations. 

G.  Write  a  matrix  as  a  product  of  elementary  matrices. 


You  have  now  solved  systems  of  equations  by  writing  them  in  terms  of  an  augmented  matrix  and 
then  doing  row  operations  on  this  augmented  matrix.  It  turns  out  that  matrices  are  important  not  only  for 
systems  of  equations  but  also  in  many  applications. 


Recall that a matrix is a rectangular array of numbers. Several of them are referred to as matrices.
For example, here is a matrix.

\[
\begin{bmatrix}
1 & 2 & 3 & 4 \\
5 & 2 & 8 & 7 \\
6 & -9 & 1 & 2
\end{bmatrix}
\tag{2.1}
\]


Recall  that  the  size  or  dimension  of  a  matrix  is  defined  as  m  x  n  where  m  is  the  number  of  rows  and  n  is 
the  number  of  columns.  The  above  matrix  is  a  3  x  4  matrix  because  there  are  three  rows  and  four  columns. 
You  can  remember  the  columns  are  like  columns  in  a  Greek  temple.  They  stand  upright  while  the  rows 
lay  flat  like  rows  made  by  a  tractor  in  a  plowed  field. 


When specifying the size of a matrix, you always list the number of rows before the number of
columns. You might remember that you always list the rows before the columns by using the phrase
Rowman Catholic.

Consider the following definition.


Definition  2.1:  Square  Matrix 


A matrix A which has size n x n is called a square matrix. In other words, A is a square matrix if
it has the same number of rows and columns.


There is some notation specific to matrices which we now introduce. We denote the columns of a
matrix A by $A_j$ as follows

\[ A = \begin{bmatrix} A_1 & A_2 & \cdots & A_n \end{bmatrix} \]

Therefore, $A_j$ is the jth column of A, when counted from left to right.

The individual elements of the matrix are called entries or components of A. Elements of the matrix
are identified according to their position. The (i, j)-entry of a matrix is the entry in the ith row and jth
column. For example, in the matrix 2.1 above, 8 is in position (2, 3) (and is called the (2, 3)-entry) because
it is in the second row and the third column.

In order to remember which matrix we are speaking of, we will denote the entry in the ith row and
the jth column of matrix A by $a_{ij}$. Then, we can write A in terms of its entries, as $A = [a_{ij}]$. Using this
notation on the matrix in 2.1, $a_{23} = 8$, $a_{32} = -9$, $a_{12} = 2$, etc.
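The row-then-column convention can be tried out directly. The following is a minimal sketch, assuming the matrix in 2.1 is stored as a Python list of rows; the helper names `entry` and `size` are hypothetical and are not part of the text. Python counts from 0, so the helper shifts the indices to match the 1-based $a_{ij}$ notation.

```python
# The matrix from (2.1), stored as a list of rows.
A = [
    [1, 2, 3, 4],
    [5, 2, 8, 7],
    [6, -9, 1, 2],
]


def size(matrix):
    """Return (number of rows, number of columns), rows first."""
    return len(matrix), len(matrix[0])


def entry(matrix, i, j):
    """Return the (i, j)-entry using the 1-based convention of the text."""
    return matrix[i - 1][j - 1]


print(size(A))         # (3, 4)
print(entry(A, 2, 3))  # 8
print(entry(A, 3, 2))  # -9
print(entry(A, 1, 2))  # 2
```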

There  are  various  operations  which  are  done  on  matrices  of  appropriate  sizes.  Matrices  can  be  added 
to  and  subtracted  from  other  matrices,  multiplied  by  a  scalar,  and  multiplied  by  other  matrices.  We  will 
never  divide  a  matrix  by  another  matrix,  but  we  will  see  later  how  matrix  inverses  play  a  similar  role. 

In  doing  arithmetic  with  matrices,  we  often  define  the  action  by  what  happens  in  terms  of  the  entries 
(or  components)  of  the  matrices.  Before  looking  at  these  operations  in  depth,  consider  a  few  general 
definitions. 


Definition  2.2:  The  Zero  Matrix 


The  m  x  n  zero  matrix  is  the  m  x  n  matrix  having  every  entry  equal  to  zero.  It  is  denoted  by  0. 


One  possible  zero  matrix  is  shown  in  the  following  example. 


Example  2.3:  The  Zero  Matrix 

The 2 x 3 zero matrix is

\[ 0 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]

Note there is a 2 x 3 zero matrix, a 3 x 4 zero matrix, etc. In fact there is a zero matrix for every size!

In other words, two matrices are equal exactly when they are the same size and the corresponding
entries are identical. Thus

\[
\begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
\neq
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]

because they are different sizes. Also,

\[
\begin{bmatrix} 0 & 1 \\ 3 & 2 \end{bmatrix}
\neq
\begin{bmatrix} 1 & 0 \\ 2 & 3 \end{bmatrix}
\]

because, although they are the same size, their corresponding entries are not identical.
In  the  following  section,  we  explore  addition  of  matrices. 


2.1.1.  Addition  of  Matrices 


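When adding matrices, all matrices in the sum need to have the same size. For example,

\[
\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 2 \end{bmatrix}
\quad \text{and} \quad
\begin{bmatrix} -1 & 4 & 8 \\ 2 & 8 & 5 \end{bmatrix}
\]

cannot be added, as one has size 3 x 2 while the other has size 2 x 3.

However, the addition

\[
\begin{bmatrix} 4 & 6 & 3 \\ 5 & 0 & 4 \\ 11 & -2 & 3 \end{bmatrix}
+
\begin{bmatrix} 0 & 5 & 0 \\ 4 & -4 & 14 \\ 1 & 2 & 6 \end{bmatrix}
\]

is possible.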

The  formal  definition  is  as  follows. 


This definition tells us that when adding matrices, we simply add corresponding entries of the matrices.
This is demonstrated in the next example.

Solution. Notice that both A and B are of size 2 x 3. Since A and B are of the same size, the addition is
possible. Using Definition 2.5, the addition is done as follows.

\[
A + B =
\begin{bmatrix} 1 & 2 & 3 \\ 1 & 0 & 4 \end{bmatrix}
+
\begin{bmatrix} 5 & 2 & 3 \\ -6 & 2 & 1 \end{bmatrix}
=
\begin{bmatrix} 1 + 5 & 2 + 2 & 3 + 3 \\ 1 + (-6) & 0 + 2 & 4 + 1 \end{bmatrix}
=
\begin{bmatrix} 6 & 4 & 6 \\ -5 & 2 & 5 \end{bmatrix}
\]

♠
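Entrywise addition, together with the size check, can be sketched in a few lines. This is only an illustration under the assumption that matrices are stored as lists of rows; the function name `add_matrices` is hypothetical. The matrices A and B are the ones appearing in the solution above.

```python
def add_matrices(A, B):
    """Add two matrices entry by entry; raise ValueError if the sizes differ."""
    same_rows = len(A) == len(B)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same size to be added")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 2, 3], [1, 0, 4]]
B = [[5, 2, 3], [-6, 2, 1]]
print(add_matrices(A, B))  # [[6, 4, 6], [-5, 2, 5]]

# A 3 x 2 matrix and a 2 x 3 matrix cannot be added.
try:
    add_matrices([[1, 2], [3, 4], [5, 2]], [[-1, 4, 8], [2, 8, 5]])
except ValueError as error:
    print("not defined:", error)
```

Because each entry of the sum is an ordinary sum of numbers, `add_matrices(A, B)` and `add_matrices(B, A)` return the same result.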

Addition  of  matrices  obeys  very  much  the  same  properties  as  normal  addition  with  numbers.  Note  that 
when  we  write  for  example  A+B  then  we  assume  that  both  matrices  are  of  equal  size  so  that  the  operation 
is  indeed  possible. 


Proof. Consider the Commutative Law of Addition given in 2.2. Let A, B, C, and D be matrices such that
A + B = C and B + A = D. We want to show that D = C. To do so, we will use the definition of matrix
addition given in Definition 2.5. Now,

\[ c_{ij} = a_{ij} + b_{ij} = b_{ij} + a_{ij} = d_{ij} \]

Therefore, C = D because the ijth entries are the same for all i and j. Note that the conclusion follows
from the commutative law of addition of numbers, which says that if a and b are two numbers, then
a + b = b + a. The proofs of the other results are similar, and are left as an exercise. ♠

We call the zero matrix in 2.4 the additive identity. Similarly, we call the matrix −A in 2.5 the
additive inverse. −A is defined to equal $(-1)A = [-a_{ij}]$. In other words, every entry of A is multiplied
by −1. In the next section we will study scalar multiplication in more depth to understand what is meant
by (−1)A.


2.1.2. Scalar Multiplication of Matrices


Recall that we use the word scalar when referring to numbers. Therefore, scalar multiplication of a matrix
is the multiplication of a matrix by a number. To illustrate this concept, consider the following example in
which a matrix is multiplied by the scalar 3.

\[
3
\begin{bmatrix}
1 & 2 & 3 & 4 \\
5 & 2 & 8 & 7 \\
6 & -9 & 1 & 2
\end{bmatrix}
=
\begin{bmatrix}
3 & 6 & 9 & 12 \\
15 & 6 & 24 & 21 \\
18 & -27 & 3 & 6
\end{bmatrix}
\]

The  new  matrix  is  obtained  by  multiplying  every  entry  of  the  original  matrix  by  the  given  scalar. 
The  formal  definition  of  scalar  multiplication  is  as  follows. 


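Consider the following example.


Example 2.9: Effect of Multiplication by a Scalar

Find the result of multiplying the following matrix A by 7.

\[ A = \begin{bmatrix} 2 & 0 \\ 1 & -4 \end{bmatrix} \]

Solution. By Definition 2.8, we multiply each element of A by 7. Therefore,

\[
7A = 7 \begin{bmatrix} 2 & 0 \\ 1 & -4 \end{bmatrix}
= \begin{bmatrix} 7(2) & 7(0) \\ 7(1) & 7(-4) \end{bmatrix}
= \begin{bmatrix} 14 & 0 \\ 7 & -28 \end{bmatrix}
\]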

Similarly to addition of matrices, there are several properties of scalar multiplication which hold.

The proof of this proposition is similar to the proof of Proposition 2.7 and is left as an exercise to the
reader.

2.1.3.  Multiplication  of  Matrices 


The  next  important  matrix  operation  we  will  explore  is  multiplication  of  matrices.  The  operation  of  matrix 
multiplication  is  one  of  the  most  important  and  useful  of  the  matrix  operations.  Throughout  this  section, 
we  will  also  demonstrate  how  matrix  multiplication  relates  to  linear  systems  of  equations. 

First,  we  provide  a  formal  definition  of  row  and  column  vectors. 


We may simply use the term vector throughout this text to refer to either a column or row vector. If
we do so, the context will make it clear which we are referring to.

In this chapter, we will again use the notion of linear combination of vectors as in Definition 9.12. In
this context, a linear combination is a sum consisting of vectors multiplied by scalars. For example,

\[
\begin{bmatrix} 50 \\ 122 \end{bmatrix}
= 7 \begin{bmatrix} 1 \\ 4 \end{bmatrix}
+ 8 \begin{bmatrix} 2 \\ 5 \end{bmatrix}
+ 9 \begin{bmatrix} 3 \\ 6 \end{bmatrix}
\]

is a linear combination of three vectors.

It  turns  out  that  we  can  express  any  system  of  linear  equations  as  a  linear  combination  of  vectors.  In 
fact,  the  vectors  that  we  will  use  are  just  the  columns  of  the  corresponding  augmented  matrix! 


Definition  2.12:  The  Vector  Form  of  a  System  of  Linear  Equations 


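Suppose we have a system of equations given by

\[
\begin{array}{c}
a_{11}x_1 + \cdots + a_{1n}x_n = b_1 \\
\vdots \\
a_{m1}x_1 + \cdots + a_{mn}x_n = b_m
\end{array}
\]

We can express this system in vector form which is as follows:

\[
x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}
+ x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}
+ \cdots
+ x_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}
= \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}
\]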

Notice  that  each  vector  used  here  is  one  column  from  the  corresponding  augmented  matrix.  There  is 
one  vector  for  each  variable  in  the  system,  along  with  the  constant  vector. 

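The first important form of matrix multiplication is multiplying a matrix by a vector. Consider the
product given by

\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
\begin{bmatrix} 7 \\ 8 \\ 9 \end{bmatrix}
\]

We will soon see that this equals

\[
7 \begin{bmatrix} 1 \\ 4 \end{bmatrix}
+ 8 \begin{bmatrix} 2 \\ 5 \end{bmatrix}
+ 9 \begin{bmatrix} 3 \\ 6 \end{bmatrix}
= \begin{bmatrix} 50 \\ 122 \end{bmatrix}
\]

In general terms,

\[
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= x_1 \begin{bmatrix} a_{11} \\ a_{21} \end{bmatrix}
+ x_2 \begin{bmatrix} a_{12} \\ a_{22} \end{bmatrix}
+ x_3 \begin{bmatrix} a_{13} \\ a_{23} \end{bmatrix}
= \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 \end{bmatrix}
\]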


Thus you take $x_1$ times the first column, add to $x_2$ times the second column, and finally $x_3$ times the third
column. The above sum is a linear combination of the columns of the matrix. When you multiply a matrix
on the left by a vector on the right, the numbers making up the vector are just the scalars to be used in the
linear combination of the columns as illustrated above.
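The description above translates into a short routine that builds the product one column at a time. This is a minimal sketch, assuming a matrix is a list of rows and a vector is a flat list; the name `mat_vec` is hypothetical. The outer loop runs over the columns of the matrix, adding $x_j$ times column j to a running total.

```python
def mat_vec(A, x):
    """Return A x, computed as a linear combination of the columns of A."""
    m, n = len(A), len(A[0])
    if len(x) != n:
        raise ValueError("the vector needs one entry per column of the matrix")
    result = [0] * m
    for j in range(n):       # one term of the linear combination per column
        for i in range(m):   # add x_j times the jth column, entry by entry
            result[i] += x[j] * A[i][j]
    return result


A = [[1, 2, 3], [4, 5, 6]]
x = [7, 8, 9]
print(mat_vec(A, x))  # [50, 122]
```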

Here  is  the  formal  definition  of  how  to  multiply  an  m  x  n  matrix  by  an  n  x  1  column  vector. 


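If we write the columns of A in terms of their entries, they are of the form

\[
A_j = \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{bmatrix}
\]

Then, we can write the product AX as

\[
AX = x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}
+ x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}
+ \cdots
+ x_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}
\]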

Note  that  multiplication  of  an  m  x  n  matrix  and  an  n  x  1  vector  produces  an  m  x  1  vector. 
Here  is  an  example. 


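Solution. We will use Definition 2.13 to compute the product. Therefore, we compute the product AX as
follows.

\[
1 \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}
+ 2 \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix}
+ 0 \begin{bmatrix} 1 \\ 1 \\ 4 \end{bmatrix}
+ 1 \begin{bmatrix} 3 \\ -2 \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}
+ \begin{bmatrix} 4 \\ 4 \\ 2 \end{bmatrix}
+ \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
+ \begin{bmatrix} 3 \\ -2 \\ 1 \end{bmatrix}
= \begin{bmatrix} 8 \\ 2 \\ 5 \end{bmatrix}
\]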

Using  the  above  operation,  we  can  also  write  a  system  of  linear  equations  in  matrix  form.  In  this 
form,  we  express  the  system  as  a  matrix  multiplied  by  a  vector.  Consider  the  following  definition. 


Definition  2.15:  The  Matrix  Form  of  a  System  of  Linear  Equations 


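Suppose we have a system of equations given by

\[
\begin{array}{c}
a_{11}x_1 + \cdots + a_{1n}x_n = b_1 \\
a_{21}x_1 + \cdots + a_{2n}x_n = b_2 \\
\vdots \\
a_{m1}x_1 + \cdots + a_{mn}x_n = b_m
\end{array}
\]

Then we can express this system in matrix form as follows.

\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}
\]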

The expression AX = B is also known as the Matrix Form of the corresponding system of linear
equations. The matrix A is simply the coefficient matrix of the system, the vector X is the column vector
constructed from the variables of the system, and finally the vector B is the column vector constructed
from the constants of the system. It is important to note that any system of linear equations can be written
in this form.

Notice that if we write a homogeneous system of equations in matrix form, it would have the form
AX = 0, for the zero vector 0.

You can see from this definition that a vector

\[ X = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \]

will satisfy the equation AX = B only when the entries $x_1, x_2, \cdots, x_n$ of the vector X are solutions to the
original system.

Now  that  we  have  examined  how  to  multiply  a  matrix  by  a  vector,  we  wish  to  consider  the  case  where 
we  multiply  two  matrices  of  more  general  sizes,  although  these  sizes  still  need  to  be  appropriate  as  we  will 
see.  For  example,  in  Example  2.14,  we  multiplied  a  3  x  4  matrix  by  a  4  x  1  vector.  We  want  to  investigate 
how  to  multiply  other  sizes  of  matrices. 

We  have  not  yet  given  any  conditions  on  when  matrix  multiplication  is  possible!  For  matrices  A  and 
B,  in  order  to  form  the  product  AB,  the  number  of  columns  of  A  must  equal  the  number  of  rows  of  B. 
Consider  a  product  AB  where  A  has  size  m  x  n  and  B  has  size  n  x  p.  Then,  the  product  in  terms  of  size  of 
matrices  is  given  by 

\[
(m \times \underbrace{n)\,(n}_{\text{these must match!}} \times p) = m \times p
\]

Note  the  two  outside  numbers  give  the  size  of  the  product.  One  of  the  most  important  rules  regarding 
matrix  multiplication  is  the  following.  If  the  two  middle  numbers  don’t  match,  you  can’t  multiply  the 
matrices! 

When  the  number  of  columns  of  A  equals  the  number  of  rows  of  B  the  two  matrices  are  said  to  be 
conformable  and  the  product  AB  is  obtained  as  follows. 


Consider the following example.


Example 2.17: Multiplying Two Matrices

Find AB if possible.

\[
A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{bmatrix}
\]

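Solution. The first thing you need to verify when calculating a product is whether the multiplication is
possible. The first matrix has size 2 x 3 and the second matrix has size 3 x 3. The inside numbers are
equal, so A and B are conformable matrices. According to the above discussion AB will be a 2 x 3 matrix.
Definition 2.16 gives us a way to calculate each column of AB, as follows.

\[
\left[\;
\overbrace{\begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{bmatrix}
\begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix}}^{\text{First column}},\;
\overbrace{\begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{bmatrix}
\begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix}}^{\text{Second column}},\;
\overbrace{\begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}}^{\text{Third column}}
\;\right]
\]

You know how to multiply a matrix times a vector, using Definition 2.13 for each of the three columns.
Thus

\[
\begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{bmatrix}
=
\begin{bmatrix} -1 & 9 & 3 \\ -2 & 7 & 3 \end{bmatrix}
\]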
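The column-by-column procedure used in this solution can be sketched as follows. This is an illustration only, assuming matrices are stored as lists of rows; the names `mat_vec` and `mat_mul` are hypothetical. The product is assembled by multiplying A against each column of B in turn, after first checking that the inside numbers match.

```python
def mat_vec(A, x):
    """Return A x as a linear combination of the columns of A."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]


def mat_mul(A, B):
    """Return AB, built one column at a time from the columns of B."""
    n = len(A[0])
    if len(B) != n:
        raise ValueError("columns of A must equal rows of B")
    p = len(B[0])
    # Column j of AB is A times column j of B.
    columns = [mat_vec(A, [B[k][j] for k in range(n)]) for j in range(p)]
    # Reassemble the list of columns into a list of rows.
    return [[columns[j][i] for j in range(p)] for i in range(len(A))]


A = [[1, 2, 1], [0, 2, 1]]
B = [[1, 2, 0], [0, 3, 1], [-2, 1, 1]]
print(mat_mul(A, B))  # [[-1, 9, 3], [-2, 7, 3]]

# The product in the other order has the form (3 x 3)(2 x 3) and is not defined.
try:
    mat_mul(B, A)
except ValueError as error:
    print("not defined:", error)
```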


Since  vectors  are  simply  n  x  1  or  1  x  m  matrices,  we  can  also  multiply  a  vector  by  another  vector. 


Example  2.18:  Vector  Times  Vector  Multiplication 

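Multiply if possible

\[
\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 1 & 0 \end{bmatrix}.
\]

Solution. In this case we are multiplying a matrix of size 3 x 1 by a matrix of size 1 x 4. The inside
numbers match so the product is defined. Note that the product will be a matrix of size 3 x 4. Using
Definition 2.16, we can compute this product as follows

\[
\left[\;
\overbrace{\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \begin{bmatrix} 1 \end{bmatrix}}^{\text{First column}},\;
\overbrace{\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \begin{bmatrix} 2 \end{bmatrix}}^{\text{Second column}},\;
\overbrace{\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \begin{bmatrix} 1 \end{bmatrix}}^{\text{Third column}},\;
\overbrace{\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \begin{bmatrix} 0 \end{bmatrix}}^{\text{Fourth column}}
\;\right]
\]

You can use Definition 2.13 to verify that this product is

\[
\begin{bmatrix}
1 & 2 & 1 & 0 \\
2 & 4 & 2 & 0 \\
1 & 2 & 1 & 0
\end{bmatrix}
\]

♠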


Example  2.19:  A  Multiplication  Which  is  Not  Defined 


Find BA if possible.

\[
B = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{bmatrix}, \quad
A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{bmatrix}
\]

Solution. First check if it is possible. This product is of the form (3 x 3)(2 x 3). The inside numbers do
not match and so you can't do this multiplication. ♠

In this case, we say that the multiplication is not defined. Notice that these are the same matrices which
we used in Example 2.17. In this example, we tried to calculate BA instead of AB. This demonstrates
another property of matrix multiplication. While the product AB may be defined, we cannot assume
that the product BA will be possible. Therefore, it is important to always check that the product is defined
before carrying out any calculations.

Earlier,  we  defined  the  zero  matrix  0  to  be  the  matrix  (of  appropriate  size)  containing  zeros  in  all 
entries.  Consider  the  following  example  for  multiplication  by  the  zero  matrix. 


Solution. In this product, we compute

\[
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
=
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]

Hence, A0 = 0.

Notice that we could also multiply A by the 2 x 1 zero vector given by
$\begin{bmatrix} 0 \\ 0 \end{bmatrix}$. The result would be the
2 x 1 zero vector. Therefore, it is always the case that A0 = 0, for an appropriately sized zero matrix or
vector.


2.1.4.  The  ijth  Entry  of  a  Product 


In  previous  sections,  we  used  the  entries  of  a  matrix  to  describe  the  action  of  matrix  addition  and  scalar 
multiplication.  We  can  also  study  matrix  multiplication  using  the  entries  of  matrices. 

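What is the ijth entry of AB? It is the entry in the ith row and the jth column of the product AB.

Now if A is m x n and B is n x p, then we know that the product AB has the form

\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix}
b_{11} & b_{12} & \cdots & b_{1j} & \cdots & b_{1p} \\
b_{21} & b_{22} & \cdots & b_{2j} & \cdots & b_{2p} \\
\vdots & \vdots & & \vdots & & \vdots \\
b_{n1} & b_{n2} & \cdots & b_{nj} & \cdots & b_{np}
\end{bmatrix}
\]

The jth column of AB is of the form

\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{nj} \end{bmatrix}
\]

which is an m x 1 column vector. It is calculated by

\[
b_{1j} \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}
+ b_{2j} \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}
+ \cdots
+ b_{nj} \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}
\]

Therefore, the ijth entry is the entry in row i of this vector. This is computed by

\[
a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} = \sum_{k=1}^{n} a_{ik}b_{kj}
\]

The following is the formal definition for the ijth entry of a product of matrices.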


In other words, to find the (i, j)-entry of the product AB, or $(AB)_{ij}$, you multiply the ith row of A, on
the left by the jth column of B. To express AB in terms of its entries, we write $AB = [(AB)_{ij}]$.

Consider the following example.

Solution. First check if the product is possible. It is of the form (3 x 2)(2 x 3) and since the inside
numbers match, it is possible to do the multiplication. The result should be a 3 x 3 matrix. We can first
compute AB:

\[
\left[\;
\begin{bmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{bmatrix} \begin{bmatrix} 2 \\ 7 \end{bmatrix},\;
\begin{bmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{bmatrix} \begin{bmatrix} 3 \\ 6 \end{bmatrix},\;
\begin{bmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix}
\;\right]
\]

where the commas separate the columns in the resulting product. Thus the above product equals

\[
\begin{bmatrix}
16 & 15 & 5 \\
13 & 15 & 5 \\
46 & 42 & 14
\end{bmatrix}
\]

which is a 3 x 3 matrix as desired. Thus, the (3, 2)-entry equals 42.
Now using Definition 2.21, we can find that the (3, 2)-entry equals

\[
\sum_{k=1}^{2} a_{3k}b_{k2} = a_{31}b_{12} + a_{32}b_{22} = 2 \times 3 + 6 \times 6 = 42
\]


Consulting  our  result  for  AB  above,  this  is  correct! 

You  may  wish  to  use  this  method  to  verify  that  the  rest  of  the  entries  in  AB  are  correct. 
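One way to carry out that check is to code the entry formula $\sum_k a_{ik}b_{kj}$ directly. The sketch below is illustrative only: the function name `product_entry` is hypothetical, and the matrices A and B are read off from the column computation in the solution above (they are an assumption about the example statement). Indices are 1-based to match the text.

```python
def product_entry(A, B, i, j):
    """Return the (i, j)-entry of AB (1-based) as the sum of a_ik * b_kj over k."""
    n = len(A[0])
    if len(B) != n:
        raise ValueError("columns of A must equal rows of B")
    return sum(A[i - 1][k] * B[k][j - 1] for k in range(n))


A = [[1, 2], [3, 1], [2, 6]]
B = [[2, 3, 1], [7, 6, 2]]

print(product_entry(A, B, 3, 2))  # 42

# Rebuild the whole 3 x 3 product one entry at a time.
AB = [[product_entry(A, B, i, j) for j in range(1, 4)] for i in range(1, 4)]
print(AB)  # [[16, 15, 5], [13, 15, 5], [46, 42, 14]]
```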

Here  is  another  example. 


Example  2.23:  Finding  the  Entries  of  a  Product 


Determine if the product AB is defined. If it is, find the (2, 1)-entry of the product.

\[
A = \begin{bmatrix} 2 & 3 & 1 \\ 7 & 6 & 2 \\ 0 & 0 & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{bmatrix}
\]

Solution. This product is of the form (3 x 3)(3 x 2). The middle numbers match so the matrices are
conformable and it is possible to compute the product.

We want to find the (2, 1)-entry of AB, that is, the entry in the second row and first column of the
product. We will use Definition 2.21, which states

\[ (AB)_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj} \]

In this case, n = 3, i = 2 and j = 1. Hence the (2, 1)-entry is found by computing

\[
(AB)_{21} = \sum_{k=1}^{3} a_{2k}b_{k1}
= \begin{bmatrix} a_{21} & a_{22} & a_{23} \end{bmatrix}
\begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \end{bmatrix}
\]

Substituting in the appropriate values, this product becomes

\[
\begin{bmatrix} a_{21} & a_{22} & a_{23} \end{bmatrix}
\begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \end{bmatrix}
= \begin{bmatrix} 7 & 6 & 2 \end{bmatrix}
\begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix}
= 1 \times 7 + 3 \times 6 + 2 \times 2 = 29
\]

Hence, $(AB)_{21} = 29$.

You should take a moment to find a few other entries of AB. You can multiply the matrices to check
that your answers are correct. The product AB is given by

\[
AB = \begin{bmatrix} 13 & 13 \\ 29 & 32 \\ 0 & 0 \end{bmatrix}
\]

♠


2.1.5.  Properties  of  Matrix  Multiplication 


As  pointed  out  above,  it  is  sometimes  possible  to  multiply  matrices  in  one  order  but  not  in  the  other  order. 
However,  even  if  both  AB  and  BA  are  defined,  they  may  not  be  equal. 


Example  2.24:  Matrix  Multiplication  is  Not  Commutative 

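Compare the products AB and BA, for matrices

\[
A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad
B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\]

Solution. First, notice that A and B are both of size 2 x 2. Therefore, both products AB and BA are defined.
The first product, AB is

\[
AB = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
= \begin{bmatrix} 2 & 1 \\ 4 & 3 \end{bmatrix}
\]

The second product, BA is

\[
BA = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
= \begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix}
\]

Therefore, AB ≠ BA. ♠


This example illustrates that you cannot assume AB = BA even when multiplication is defined in both
orders. If for some matrices A and B it is true that AB = BA, then we say that A and B commute. This is
one important property of matrix multiplication.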
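The failure of commutativity is easy to reproduce numerically. The following is a small sketch, assuming matrices are lists of rows; `mat_mul` is a hypothetical helper that applies the row-times-column rule, and the 2 x 2 identity used at the end is an extra illustration of a pair that does commute.

```python
def mat_mul(A, B):
    """Multiply A (m x n) by B (n x p) using the row-times-column rule."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(AB)        # [[2, 1], [4, 3]]
print(BA)        # [[3, 4], [1, 2]]
print(AB == BA)  # False

# Some pairs do commute, for instance any 2 x 2 matrix with the 2 x 2 identity.
I = [[1, 0], [0, 1]]
print(mat_mul(A, I) == mat_mul(I, A))  # True
```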


The following are other important properties of matrix multiplication. Notice that these properties hold
only when the sizes of the matrices are such that the products are defined.


Proof. First we will prove 2.6. We will use Definition 2.21 and prove this statement using the ijth entries
of a matrix. Therefore,

\[
\begin{aligned}
(A(rB + sC))_{ij} &= \sum_k a_{ik}(rB + sC)_{kj} = \sum_k a_{ik}(rb_{kj} + sc_{kj}) \\
&= r\sum_k a_{ik}b_{kj} + s\sum_k a_{ik}c_{kj} = r(AB)_{ij} + s(AC)_{ij} \\
&= (r(AB) + s(AC))_{ij}
\end{aligned}
\]

Thus A(rB + sC) = r(AB) + s(AC) as claimed.

The proof of 2.7 follows the same pattern and is left as an exercise.

Statement 2.8 is the associative law of multiplication. Using Definition 2.21,

\[
\begin{aligned}
(A(BC))_{ij} &= \sum_k a_{ik}(BC)_{kj} = \sum_k a_{ik}\sum_l b_{kl}c_{lj} \\
&= \sum_l (AB)_{il}c_{lj} = ((AB)C)_{ij}.
\end{aligned}
\]

This proves 2.8. ♠

2.1.6.  The  Transpose 


Another important operation on matrices is that of taking the transpose. For a matrix A, we denote the
transpose of A by $A^T$. Before formally defining the transpose, we explore this operation on the following
matrix.

\[
\begin{bmatrix} 1 & 4 \\ 3 & 1 \\ 2 & 6 \end{bmatrix}^T
= \begin{bmatrix} 1 & 3 & 2 \\ 4 & 1 & 6 \end{bmatrix}
\]

What happened? The first column became the first row and the second column became the second row.
Thus the 3 x 2 matrix became a 2 x 3 matrix. The number 4 was in the first row and the second column
and it ended up in the second row and first column.

The definition of the transpose is as follows.


The (i, j)-entry of A becomes the (j, i)-entry of $A^T$.
Consider the following example.


Example  2.27:  The  Transpose  of  a  Matrix 

Calculate $A^T$ for the following matrix

\[ A = \begin{bmatrix} 1 & 2 & -6 \\ 3 & 5 & 4 \end{bmatrix} \]

Solution. By Definition 2.26, we know that for $A = [a_{ij}]$, $A^T = [a_{ji}]$. In other words, we switch the row
and column location of each entry. The (1, 2)-entry becomes the (2, 1)-entry.

Thus,

\[ A^T = \begin{bmatrix} 1 & 3 \\ 2 & 5 \\ -6 & 4 \end{bmatrix} \]

Notice that A is a 2 x 3 matrix, while $A^T$ is a 3 x 2 matrix.

♠
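Switching the row and column location of every entry can be sketched in one line. This is only an illustration, assuming a matrix is stored as a list of rows; the function name `transpose` is hypothetical.

```python
def transpose(A):
    """Return the transpose: the (i, j)-entry of A becomes the (j, i)-entry."""
    # zip(*A) walks down the columns of A; each column becomes a row.
    return [list(column) for column in zip(*A)]


A = [[1, 2, -6], [3, 5, 4]]
At = transpose(A)
print(At)                   # [[1, 3], [2, 5], [-6, 4]]
print(len(At), len(At[0]))  # 3 2
print(transpose(At) == A)   # True
```

The last line shows that transposing twice returns the original 2 x 3 matrix.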


The transpose of a matrix has the following important properties.


Proof. First we prove 2. From Definition 2.26,

\[
\begin{aligned}
(AB)^T &= [(AB)_{ij}]^T = [(AB)_{ji}] = \sum_k a_{jk}b_{ki} = \sum_k b_{ki}a_{jk} \\
&= \sum_k [b_{ik}]^T [a_{kj}]^T = [b_{ij}]^T [a_{ij}]^T = B^T A^T
\end{aligned}
\]

The proof of Formula 3 is left as an exercise.


The transpose of a matrix is related to other important topics. Consider the following definition.


Definition  2.29:  Symmetric  and  Skew  Symmetric  Matrices 


An n x n matrix A is said to be symmetric if $A = A^T$. It is said to be skew symmetric if $A = -A^T$.


We  will  explore  these  definitions  in  the  following  examples. 


Solution. By Definition 2.29, we need to show that $A = A^T$. Now, using Definition 2.26,

\[
A^T = \begin{bmatrix} 2 & 1 & 3 \\ 1 & 5 & -3 \\ 3 & -3 & 7 \end{bmatrix}
\]

Hence, $A = A^T$, so A is symmetric.

♠


Solution.  By  Definition  2.29, 


-3 

-2 

0 


You can see that each entry of $A^T$ is equal to −1 times the same entry of A. Hence, $A^T = -A$ and so
by Definition 2.29, A is skew symmetric.

♠


2.1.7.  The  Identity  and  Inverses 


There is a special matrix, denoted I, which is referred to as the identity matrix. The identity matrix is
always a square matrix, and it has the property that there are ones down the main diagonal and zeroes
elsewhere. Here are some identity matrices of various sizes.

\[
\begin{bmatrix} 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\]

The first is the 1 x 1 identity matrix, the second is the 2 x 2 identity matrix, and so on. By extension, you
can likely see what the n x n identity matrix would be. When it is necessary to distinguish which size of
identity matrix is being discussed, we will use the notation $I_n$ for the n x n identity matrix.

The identity matrix is so important that there is a special symbol to denote the ijth entry of the identity
matrix. This symbol is given by $I_{ij} = \delta_{ij}$ where $\delta_{ij}$ is the Kronecker symbol defined by

\[
\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}
\]

$I_n$ is called the identity matrix because it is a multiplicative identity in the following sense.


Lemma  2.32:  Multiplication  by  the  Identity  Matrix 


Suppose A is an m x n matrix and $I_n$ is the n x n identity matrix. Then $AI_n = A$. If $I_m$ is the m x m
identity matrix, it also follows that $I_m A = A$.


Proof. The (i, j)-entry of $AI_n$ is given by:

\[ \sum_k a_{ik}\delta_{kj} = a_{ij} \]

and so $AI_n = A$. The other case is left as an exercise for you. ♠

We  now  define  the  matrix  operation  which  in  some  ways  plays  the  role  of  division. 


Such a matrix $A^{-1}$ will have the same size as the matrix A. It is very important to observe that the
inverse of a matrix, if it exists, is unique. Another way to think of this is that if it acts like the inverse, then
it is the inverse.


Theorem 2.34: Uniqueness of Inverse


Suppose A is an n x n matrix such that an inverse $A^{-1}$ exists. Then there is only one such inverse
matrix. That is, given any matrix B such that AB = BA = I, $B = A^{-1}$.


Proof. In this proof, it is assumed that I is the n x n identity matrix. Let A, B be n x n matrices such that
$A^{-1}$ exists and AB = BA = I. We want to show that $A^{-1} = B$. Now using properties we have seen, we get:

\[ A^{-1} = A^{-1}I = A^{-1}(AB) = (A^{-1}A)B = IB = B \]

Hence, $A^{-1} = B$ which tells us that the inverse is unique. ♠

The  next  example  demonstrates  how  to  check  the  inverse  of  a  matrix. 


Example  2.35:  Verifying  the  Inverse  of  a  Matrix 

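Let $A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}$. Show
$\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}$ is the inverse of A.

Solution. To check this, multiply

\[
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

and

\[
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

showing that this matrix is indeed the inverse of A. ♠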
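The same check can be automated by multiplying in both orders and comparing with the identity. This is a minimal sketch, assuming matrices are lists of rows; the helper names `mat_mul`, `identity` and `is_inverse` are hypothetical.

```python
def mat_mul(A, B):
    """Multiply A (m x n) by B (n x p) using the row-times-column rule."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def identity(n):
    """Return the n x n identity matrix: ones on the main diagonal, zeros elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def is_inverse(A, B):
    """Return True when AB = BA = I for square matrices A and B of the same size."""
    n = len(A)
    return mat_mul(A, B) == identity(n) and mat_mul(B, A) == identity(n)


A = [[1, 1], [1, 2]]
B = [[2, -1], [-1, 1]]
print(is_inverse(A, B))  # True
```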

Unlike ordinary multiplication of numbers, it can happen that A ≠ 0 but A may fail to have an inverse.
This is illustrated in the following example.


Example  2.36:  A  Nonzero  Matrix  With  No  Inverse 

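Let $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$. Show that A does not have an inverse.

Solution. One might think A would have an inverse because it does not equal zero. However, note that

\[
\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} -1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]

If $A^{-1}$ existed, we would have the following

\[
\begin{bmatrix} 0 \\ 0 \end{bmatrix}
= A^{-1}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix} \right)
= A^{-1}\left( A \begin{bmatrix} -1 \\ 1 \end{bmatrix} \right)
= (A^{-1}A) \begin{bmatrix} -1 \\ 1 \end{bmatrix}
= I \begin{bmatrix} -1 \\ 1 \end{bmatrix}
= \begin{bmatrix} -1 \\ 1 \end{bmatrix}
\]

This says that

\[
\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}
\]

which is impossible! Therefore, A does not have an inverse. ♠


In the next section, we will explore how to find the inverse of a matrix, if it exists.


2.1.8. Finding the Inverse of a Matrix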


In  Example  2.35,  we  were  given  A  1  and  asked  to  verify  that  this  matrix  was  in  fact  the  inverse  of  A.  In 
this  section,  we  explore  how  to  find  A^1. 

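Let

\[ A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \]

as in Example 2.35. In order to find $A^{-1}$, we need to find a matrix $\begin{bmatrix} x & z \\ y & w \end{bmatrix}$ such that

\[ \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x & z \\ y & w \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]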

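We can multiply these two matrices, and see that in order for this equation to be true, we must find the solution to the systems of equations,

\[ \begin{array}{c} x + y = 1 \\ x + 2y = 0 \end{array} \]

and

\[ \begin{array}{c} z + w = 0 \\ z + 2w = 1 \end{array} \]

Writing the augmented matrix for these two systems gives

\[ \left[ \begin{array}{rr|r} 1 & 1 & 1 \\ 1 & 2 & 0 \end{array} \right] \]

for the first system and

\[ \left[ \begin{array}{rr|r} 1 & 1 & 0 \\ 1 & 2 & 1 \end{array} \right] \tag{2.9} \]

for the second.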
Let's solve the first system. Take $-1$ times the first row and add to the second to get

\[ \left[ \begin{array}{rr|r} 1 & 1 & 1 \\ 0 & 1 & -1 \end{array} \right] \]

Now take $-1$ times the second row and add to the first to get

\[ \left[ \begin{array}{rr|r} 1 & 0 & 2 \\ 0 & 1 & -1 \end{array} \right] \]

Writing in terms of variables, this says $x = 2$ and $y = -1$.

Now solve the second system, (2.9), to find $z$ and $w$. You will find that $z = -1$ and $w = 1$.

If we take the values found for $x, y, z,$ and $w$ and put them into our inverse matrix, we see that the inverse is

\[ \begin{bmatrix} x & z \\ y & w \end{bmatrix} = \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix} \]

After taking the time to solve the second system, you may have noticed that exactly the same row operations were used to solve both systems. In each case, the end result was something of the form $[I|X]$ where $I$ is the identity and $X$ gave a column of the inverse. In the above, $\begin{bmatrix} x \\ y \end{bmatrix}$, the first column of the inverse, was obtained by solving the first system and then the second column $\begin{bmatrix} z \\ w \end{bmatrix}$ by solving the second.

To simplify this procedure, we could have solved both systems at once! To do so, we could have written

\[ \left[ \begin{array}{rr|rr} 1 & 1 & 1 & 0 \\ 1 & 2 & 0 & 1 \end{array} \right] \]

and row reduced until we obtained

\[ \left[ \begin{array}{rr|rr} 1 & 0 & 2 & -1 \\ 0 & 1 & -1 & 1 \end{array} \right] \]

and read off the inverse as the $2 \times 2$ matrix on the right side.

This  exploration  motivates  the  following  important  algorithm. 


Algorithm 2.37: Matrix Inverse Algorithm


Suppose $A$ is an $n \times n$ matrix. To find $A^{-1}$ if it exists, form the augmented $n \times 2n$ matrix

\[ [A|I] \]

If possible do row operations until you obtain an $n \times 2n$ matrix of the form

\[ [I|B] \]

When this has been done, $B = A^{-1}$. In this case, we say that $A$ is invertible. If it is impossible to row reduce to a matrix of the form $[I|B]$, then $A$ has no inverse.

This algorithm shows how to find the inverse if it exists. It will also tell you if $A$ does not have an inverse.
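The Matrix Inverse Algorithm translates fairly directly into a short program. The following Python sketch is one possible rendering, not a definitive implementation: the function name `inverse` and the choice of exact rational arithmetic (`fractions.Fraction`) are our own assumptions, and the particular order of row operations may differ from the one you would choose by hand.

```python
from fractions import Fraction

def inverse(A):
    """Row reduce [A | I] to [I | B]; return B, or None if A has no inverse."""
    n = len(A)
    # Form the augmented n x 2n matrix [A | I] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Look for a row, at or below the diagonal, with a nonzero entry here.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                      # cannot reach [I | B]
        M[col], M[pivot] = M[pivot], M[col]  # switch two rows
        p = M[col][col]
        M[col] = [x / p for x in M[col]]     # scale the row to get a leading 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                # add (-f) times the pivot row to row r
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]            # the right half is the inverse

print(inverse([[1, 1], [1, 2]]))   # the matrix A of Example 2.35
print(inverse([[1, 1], [1, 1]]))   # the matrix A of Example 2.36
```

Exact fractions are used so that the entries of the result can be compared directly with hand computations, without rounding.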


Consider  the  following  example. 


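Example 2.38: Finding the Inverse

Let $A = \begin{bmatrix} 1 & 2 & 2 \\ 1 & 0 & 2 \\ 3 & 1 & -1 \end{bmatrix}$. Find $A^{-1}$ if it exists.

Solution. Set up the augmented matrix

\[ [A|I] = \left[ \begin{array}{rrr|rrr} 1 & 2 & 2 & 1 & 0 & 0 \\ 1 & 0 & 2 & 0 & 1 & 0 \\ 3 & 1 & -1 & 0 & 0 & 1 \end{array} \right] \]

Now we row reduce, with the goal of obtaining the $3 \times 3$ identity matrix on the left hand side. First, take $-1$ times the first row and add to the second followed by $-3$ times the first row added to the third row. This yields

\[ \left[ \begin{array}{rrr|rrr} 1 & 2 & 2 & 1 & 0 & 0 \\ 0 & -2 & 0 & -1 & 1 & 0 \\ 0 & -5 & -7 & -3 & 0 & 1 \end{array} \right] \]

Then take 5 times the second row and add to $-2$ times the third row.

\[ \left[ \begin{array}{rrr|rrr} 1 & 2 & 2 & 1 & 0 & 0 \\ 0 & -10 & 0 & -5 & 5 & 0 \\ 0 & 0 & 14 & 1 & 5 & -2 \end{array} \right] \]

Next take the third row and add to $-7$ times the first row. This yields

\[ \left[ \begin{array}{rrr|rrr} -7 & -14 & 0 & -6 & 5 & -2 \\ 0 & -10 & 0 & -5 & 5 & 0 \\ 0 & 0 & 14 & 1 & 5 & -2 \end{array} \right] \]

Now take $-\frac{7}{5}$ times the second row and add to the first row.

\[ \left[ \begin{array}{rrr|rrr} -7 & 0 & 0 & 1 & -2 & -2 \\ 0 & -10 & 0 & -5 & 5 & 0 \\ 0 & 0 & 14 & 1 & 5 & -2 \end{array} \right] \]

Finally divide the first row by $-7$, the second row by $-10$ and the third row by 14 which yields

\[ \left[ \begin{array}{rrr|rrr} 1 & 0 & 0 & -\frac{1}{7} & \frac{2}{7} & \frac{2}{7} \\ 0 & 1 & 0 & \frac{1}{2} & -\frac{1}{2} & 0 \\ 0 & 0 & 1 & \frac{1}{14} & \frac{5}{14} & -\frac{1}{7} \end{array} \right] \]

Notice that the left hand side of this matrix is now the $3 \times 3$ identity matrix $I_3$. Therefore, the inverse is the $3 \times 3$ matrix on the right hand side, given by

\[ \begin{bmatrix} -\frac{1}{7} & \frac{2}{7} & \frac{2}{7} \\ \frac{1}{2} & -\frac{1}{2} & 0 \\ \frac{1}{14} & \frac{5}{14} & -\frac{1}{7} \end{bmatrix} \]

♠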

It  may  happen  that  through  this  algorithm,  you  discover  that  the  left  hand  side  cannot  be  row  reduced 
to  the  identity  matrix.  Consider  the  following  example  of  this  situation. 


Example 2.39: A Matrix Which Has No Inverse

Let $A = \begin{bmatrix} 1 & 2 & 2 \\ 1 & 0 & 2 \\ 2 & 2 & 4 \end{bmatrix}$. Find $A^{-1}$ if it exists.

Solution. Write the augmented matrix $[A|I]$

\[ \left[ \begin{array}{rrr|rrr} 1 & 2 & 2 & 1 & 0 & 0 \\ 1 & 0 & 2 & 0 & 1 & 0 \\ 2 & 2 & 4 & 0 & 0 & 1 \end{array} \right] \]

and proceed to do row operations attempting to obtain $[I|A^{-1}]$. Take $-1$ times the first row and add to the second. Then take $-2$ times the first row and add to the third row.

\[ \left[ \begin{array}{rrr|rrr} 1 & 2 & 2 & 1 & 0 & 0 \\ 0 & -2 & 0 & -1 & 1 & 0 \\ 0 & -2 & 0 & -2 & 0 & 1 \end{array} \right] \]

Next add $-1$ times the second row to the third row.

\[ \left[ \begin{array}{rrr|rrr} 1 & 2 & 2 & 1 & 0 & 0 \\ 0 & -2 & 0 & -1 & 1 & 0 \\ 0 & 0 & 0 & -1 & -1 & 1 \end{array} \right] \]

At this point, you can see there will be no way to obtain $I$ on the left side of this augmented matrix. Hence, there is no way to complete this algorithm, and therefore the inverse of $A$ does not exist. In this case, we say that $A$ is not invertible. ♠

If the algorithm provides an inverse for the original matrix, it is always possible to check your answer. To do so, use the method demonstrated in Example 2.35. Check that the products $AA^{-1}$ and $A^{-1}A$ both equal the identity matrix. Through this method, you can always be sure that you have calculated $A^{-1}$ properly!
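When the inverse contains fractions, as in Example 2.38, this check is easy to get wrong by hand. The sketch below repeats it with exact rational arithmetic; the helper `matmul` and the variable names are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

# The matrix A of Example 2.38 and the inverse found there.
A = [[1, 2, 2], [1, 0, 2], [3, 1, -1]]
A_inv = [[F(-1, 7), F(2, 7), F(2, 7)],
         [F(1, 2), F(-1, 2), F(0)],
         [F(1, 14), F(5, 14), F(-1, 7)]]

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Both products must equal the 3 x 3 identity matrix.
check = matmul(A, A_inv) == I3 and matmul(A_inv, A) == I3
print(check)
```

Using `Fraction` rather than floating-point numbers means the comparison with the identity is exact.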

One way in which the inverse of a matrix is useful is to find the solution of a system of linear equations. Recall from Definition 2.15 that we can write a system of equations in matrix form, which is of the form $AX = B$. Suppose you find the inverse of the matrix, $A^{-1}$. Then you could multiply both sides of this equation on the left by $A^{-1}$ and simplify to obtain

\begin{align*}
(A^{-1})AX &= A^{-1}B \\
(A^{-1}A)X &= A^{-1}B \\
IX &= A^{-1}B \\
X &= A^{-1}B
\end{align*}

Therefore we can find $X$, the solution to the system, by computing $X = A^{-1}B$. Note that once you have found $A^{-1}$, you can easily get the solution for different right hand sides (different $B$). It is always just $A^{-1}B$.

We will explore this method of finding the solution to a system in the following example.


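Solution. First, we can write the system of equations in matrix form

\[ \begin{bmatrix} 1 & 0 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} = B \tag{2.10} \]

The inverse of the matrix

\[ A = \begin{bmatrix} 1 & 0 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{bmatrix} \]

is

\[ A^{-1} = \begin{bmatrix} 0 & \frac{1}{2} & \frac{1}{2} \\ 1 & -1 & 0 \\ 1 & -\frac{1}{2} & -\frac{1}{2} \end{bmatrix} \]

Verifying this inverse is left as an exercise.

From here, the solution to the given system (2.10) is found by

\[ \begin{bmatrix} x \\ y \\ z \end{bmatrix} = A^{-1}B = \begin{bmatrix} 0 & \frac{1}{2} & \frac{1}{2} \\ 1 & -1 & 0 \\ 1 & -\frac{1}{2} & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} = \begin{bmatrix} \frac{5}{2} \\ -2 \\ -\frac{3}{2} \end{bmatrix} \]

♠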
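What if the right side, $B$, of (2.10) had been $\begin{bmatrix} 0 \\ 1 \\ 3 \end{bmatrix}$? In other words, what would be the solution to

\[ \begin{bmatrix} 1 & 0 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 3 \end{bmatrix}? \]

By the above discussion, the solution is given by

\[ \begin{bmatrix} x \\ y \\ z \end{bmatrix} = A^{-1}B = \begin{bmatrix} 0 & \frac{1}{2} & \frac{1}{2} \\ 1 & -1 & 0 \\ 1 & -\frac{1}{2} & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ -2 \end{bmatrix} \]

This illustrates that for a system $AX = B$ where $A^{-1}$ exists, it is easy to find the solution when the vector $B$ is changed.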
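To see how little work a new right hand side requires, the computation $X = A^{-1}B$ can be sketched in Python as follows. The helper `apply` and the variable names are our own; the matrices are the $A$ and $A^{-1}$ used in the discussion above.

```python
from fractions import Fraction as F

def apply(M, v):
    """Multiply the matrix M (a list of rows) by the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A = [[1, 0, 1], [1, -1, 1], [1, 1, -1]]
A_inv = [[F(0), F(1, 2), F(1, 2)],
         [F(1), F(-1), F(0)],
         [F(1), F(-1, 2), F(-1, 2)]]

# The same inverse is reused for each right hand side B.
X1 = apply(A_inv, [1, 3, 2])
X2 = apply(A_inv, [0, 1, 3])

# Substituting back: A times each solution should reproduce B.
print(apply(A, X1), apply(A, X2))
```

Once `A_inv` is known, each additional right hand side costs only one matrix-vector product.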

We  conclude  this  section  with  some  important  properties  of  the  inverse. 


Consider  the  following  theorem. 
2.1.9. Elementary Matrices


We  now  turn  our  attention  to  a  special  type  of  matrix  called  an  elementary  matrix.  An  elementary  matrix 
is  always  a  square  matrix.  Recall  the  row  operations  given  in  Definition  1.11.  Any  elementary  matrix, 
which  we  often  denote  by  E,  is  obtained  from  applying  one  row  operation  to  the  identity  matrix  of  the 
same  size. 

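For example, the matrix

\[ E = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \]

is the elementary matrix obtained from switching the two rows. The matrix

\[ E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

is the elementary matrix obtained from multiplying the second row of the $3 \times 3$ identity matrix by 3. The matrix

\[ E = \begin{bmatrix} 1 & 0 \\ -3 & 1 \end{bmatrix} \]

is the elementary matrix obtained from adding $-3$ times the first row to the second row.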

You  may  construct  an  elementary  matrix  from  any  row  operation,  but  remember  that  you  can  only 
apply  one  operation. 

Consider  the  following  definition. 


Definition  2.43:  Elementary  Matrices  and  Row  Operations 


Let  E  be  an  n  x  n  matrix.  Then  E  is  an  elementary  matrix  if  it  is  the  result  of  applying  one  row 
operation  to  the  n  x  n  identity  matrix  In. 

Those  which  involve  switching  rows  of  the  identity  matrix  are  called  permutation  matrices. 


Therefore, $E$ constructed above by switching the two rows of $I_2$ is called a permutation matrix.

Elementary  matrices  can  be  used  in  place  of  row  operations  and  therefore  are  very  useful.  It  turns  out 
that  multiplying  (on  the  left  hand  side)  by  an  elementary  matrix  E  will  have  the  same  effect  as  doing  the 
row  operation  used  to  obtain  E. 

The  following  theorem  is  an  important  result  which  we  will  use  throughout  this  text. 


Theorem  2.44:  Multiplication  by  an  Elementary  Matrix  and  Row  Operations 


To  perform  any  of  the  three  row  operations  on  a  matrix  A  it  suffices  to  take  the  product  EA,  where 
E  is  the  elementary  matrix  obtained  by  using  the  desired  row  operation  on  the  identity  matrix. 


Therefore,  instead  of  performing  row  operations  on  a  matrix  A,  we  can  row  reduce  through  matrix 
multiplication  with  the  appropriate  elementary  matrix.  We  will  examine  this  theorem  in  detail  for  each  of 
the  three  row  operations  given  in  Definition  1.11. 
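The claim of Theorem 2.44 can be tried out numerically before working through the lemmas. In the following Python sketch, the function names `swap`, `scale`, and `add_multiple` are our own labels for the three kinds of elementary matrices, and rows are numbered from 0 as is customary in Python (so row 0 is the first row).

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def swap(n, i, j):
    """Elementary matrix: switch rows i and j of the identity."""
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E

def scale(n, k, i):
    """Elementary matrix: multiply row i of the identity by nonzero k."""
    E = identity(n)
    E[i][i] = k
    return E

def add_multiple(n, k, i, j):
    """Elementary matrix: add k times row i of the identity to row j."""
    E = identity(n)
    E[j][i] = k
    return E

A = [[1, 2], [3, 4], [5, 6]]

swapped = matmul(swap(3, 0, 1), A)            # rows 1 and 2 of A switched
scaled = matmul(scale(3, 5, 1), A)            # second row of A times 5
added = matmul(add_multiple(3, 2, 0, 2), A)   # 2 times first row added to third
print(swapped, scaled, added)
```

In each case the product $EA$ is the matrix obtained by applying to $A$ the row operation that produced $E$ from the identity.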
First, consider the following lemma.


Lemma 2.45: Action of Permutation Matrix


Let $P^{ij}$ denote the elementary matrix which involves switching the $i$th and the $j$th rows. Then $P^{ij}$ is a permutation matrix and

\[ P^{ij}A = B \]

where $B$ is obtained from $A$ by switching the $i$th and the $j$th rows.


We  will  explore  this  idea  more  in  the  following  example. 


Solution. You can see that the matrix $P^{12}$ is obtained by switching the first and second rows of the $3 \times 3$ identity matrix $I$.

Using our usual procedure, compute the product $P^{12}A = B$. The result is given by

\[ B = \begin{bmatrix} c & d \\ a & b \\ e & f \end{bmatrix} \]

Notice that $B$ is the matrix obtained by switching rows 1 and 2 of $A$. Therefore by multiplying $A$ by $P^{12}$, the row operation which was applied to $I$ to obtain $P^{12}$ is applied to $A$ to obtain $B$. ♠

Theorem  2.44  applies  to  all  three  row  operations,  and  we  now  look  at  the  row  operation  of  multiplying 
a  row  by  a  scalar.  Consider  the  following  lemma. 


Lemma  2.47:  Multiplication  by  a  Scalar  and  Elementary  Matrices 


Let $E(k, i)$ denote the elementary matrix corresponding to the row operation in which the $i$th row is multiplied by the nonzero scalar, $k$. Then

\[ E(k, i)A = B \]

where $B$ is obtained from $A$ by multiplying the $i$th row of $A$ by $k$.


We  will  explore  this  lemma  further  in  the  following  example. 
Solution. You can see that $E(5, 2)$ is obtained by multiplying the second row of the identity matrix by 5.

Using our usual procedure for multiplication of matrices, we can compute the product $E(5, 2)A$. The resulting matrix is given by

\[ B = \begin{bmatrix} a & b \\ 5c & 5d \\ e & f \end{bmatrix} \]

Notice that $B$ is obtained by multiplying the second row of $A$ by the scalar 5. ♠

There  is  one  last  row  operation  to  consider.  The  following  lemma  discusses  the  final  operation  of 
adding  a  multiple  of  a  row  to  another  row. 


Lemma  2.49:  Adding  Multiples  of  Rows  and  Elementary  Matrices 


Let $E(k \times i + j)$ denote the elementary matrix obtained from $I$ by adding $k$ times the $i$th row to the $j$th. Then

\[ E(k \times i + j)A = B \]

where $B$ is obtained from $A$ by adding $k$ times the $i$th row to the $j$th row of $A$.


Consider  the  following  example. 


Solution. You can see that the matrix $E(2 \times 1 + 3)$ was obtained by adding 2 times the first row of $I$ to the third row of $I$.

Using our usual procedure, we can compute the product $E(2 \times 1 + 3)A$. The resulting matrix $B$ is given by

\[ B = \begin{bmatrix} a & b \\ c & d \\ 2a + e & 2b + f \end{bmatrix} \]

You can see that $B$ is the matrix obtained by adding 2 times the first row of $A$ to the third row. ♠

Suppose  we  have  applied  a  row  operation  to  a  matrix  A.  Consider  the  row  operation  required  to  return 
A  to  its  original  form,  to  undo  the  row  operation.  It  turns  out  that  this  action  is  how  we  find  the  inverse  of 
an  elementary  matrix  E. 

Consider  the  following  theorem. 


Theorem  2.51:  Elementary  Matrices  and  Inverses 


Every  elementary  matrix  is  invertible  and  its  inverse  is  also  an  elementary  matrix. 


In fact, the inverse of an elementary matrix is constructed by doing the reverse row operation on $I$. $E^{-1}$ will be obtained by performing the row operation which would carry $E$ back to $I$.

•  If $E$ is obtained by switching rows $i$ and $j$, then $E^{-1}$ is also obtained by switching rows $i$ and $j$.

•  If $E$ is obtained by multiplying row $i$ by the scalar $k$, then $E^{-1}$ is obtained by multiplying row $i$ by the scalar $\frac{1}{k}$.

•  If $E$ is obtained by adding $k$ times row $i$ to row $j$, then $E^{-1}$ is obtained by subtracting $k$ times row $i$ from row $j$.
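The three rules above can be checked on small cases. The Python sketch below builds one elementary matrix of each type together with its proposed inverse and confirms that the product is the identity. The constructor names are our own, and rows are numbered from 0.

```python
from fractions import Fraction

def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def swap(n, i, j):
    """Switch rows i and j of the identity."""
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E

def scale(n, k, i):
    """Multiply row i of the identity by k."""
    E = identity(n)
    E[i][i] = k
    return E

def add_multiple(n, k, i, j):
    """Add k times row i of the identity to row j."""
    E = identity(n)
    E[j][i] = k
    return E

I3 = identity(3)

# Switching rows: the same switch undoes it.
P = swap(3, 0, 2)
swap_ok = matmul(P, P) == I3

# Scaling by k: scaling by 1/k undoes it.
S, S_inv = scale(3, 4, 1), scale(3, Fraction(1, 4), 1)
scale_ok = matmul(S, S_inv) == I3 and matmul(S_inv, S) == I3

# Adding k times row i to row j: subtracting k times row i undoes it.
T, T_inv = add_multiple(3, 5, 0, 2), add_multiple(3, -5, 0, 2)
add_ok = matmul(T, T_inv) == I3 and matmul(T_inv, T) == I3

print(swap_ok, scale_ok, add_ok)
```

A few numerical cases do not prove the theorem, but they make the pattern of "undoing" each row operation concrete.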

Consider  the  following  example. 


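Solution. Consider the elementary matrix $E$ given by

\[ E = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \]

Here, $E$ is obtained from the $2 \times 2$ identity matrix by multiplying the second row by 2. In order to carry $E$ back to the identity, we need to multiply the second row of $E$ by $\frac{1}{2}$. Hence, $E^{-1}$ is given by

\[ E^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{2} \end{bmatrix} \]

We can verify that $EE^{-1} = I$. Take the product $EE^{-1}$, given by

\[ EE^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{2} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

This equals $I$ so we know that we have computed $E^{-1}$ properly. ♠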

Suppose  an  m  x  n  matrix  A  is  row  reduced  to  its  reduced  row-echelon  form.  By  tracking  each  row 
operation  completed,  this  row  reduction  can  be  completed  through  multiplication  by  elementary  matrices. 
Consider  the  following  definition. 


Definition 2.53: The Form B = UA


Let $A$ be an $m \times n$ matrix and let $B$ be the reduced row-echelon form of $A$. Then we can write $B = UA$ where $U$ is the product of all elementary matrices representing the row operations done to $A$ to obtain $B$.


Consider  the  following  example. 


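Example 2.54: The Form B = UA

Let $A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{bmatrix}$. Find $B$, the reduced row-echelon form of $A$ and write it in the form $B = UA$.

Solution. To find $B$, row reduce $A$. For each step, we will record the appropriate elementary matrix. First, switch rows 1 and 2.

\[ \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 2 & 0 \end{bmatrix} \]

The resulting matrix is equivalent to finding the product of $P^{12} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ and $A$.

Next, add $(-2)$ times row 1 to row 3.

\[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 2 & 0 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \]

This is equivalent to multiplying by the matrix $E(-2 \times 1 + 3) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix}$. Notice that the resulting matrix is $B$, the required reduced row-echelon form of $A$.

We can then write

\begin{align*}
B &= E(-2 \times 1 + 3)\left(P^{12}A\right) \\
  &= \left(E(-2 \times 1 + 3)P^{12}\right)A \\
  &= UA
\end{align*}

It remains to find the matrix $U$.

\begin{align*}
U &= E(-2 \times 1 + 3)P^{12} \\
  &= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \\
  &= \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & -2 & 1 \end{bmatrix}
\end{align*}

We can verify that $B = UA$ holds for this matrix $U$:

\[ UA = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & -2 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} = B \]

♠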

While  the  process  used  in  the  above  example  is  reliable  and  simple  when  only  a  few  row  operations 
are  used,  it  becomes  cumbersome  in  a  case  where  many  row  operations  are  needed  to  carry  A  to  B.  The 
following  theorem  provides  an  alternate  way  to  find  the  matrix  U . 


Theorem  2.55:  Finding  the  Matrix  U 


Let $A$ be an $m \times n$ matrix and let $B$ be its reduced row-echelon form. Then $B = UA$ where $U$ is an invertible $m \times m$ matrix found by forming the matrix $[A|I_m]$ and row reducing to $[B|U]$.
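The procedure in Theorem 2.55 can be sketched in code: append $I_m$ to $A$, row reduce using only the columns of $A$ to choose pivots, and read off $B$ and $U$ from the two halves. The function name `rref_with_U` is hypothetical, and the order of row operations is one particular choice made by this sketch.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def rref_with_U(A):
    """Row reduce [A | I_m] to [B | U]; return (B, U) with B = UA."""
    m, n = len(A), len(A[0])
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(m)]
         for i, row in enumerate(A)]
    r = 0                                   # row where the next pivot will go
    for c in range(n):                      # pivots come from columns of A only
        if r == m:
            break
        pivot = next((i for i in range(r, m) if M[i][c] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]     # switch rows
        p = M[r][c]
        M[r] = [x / p for x in M[r]]        # make the leading entry 1
        for i in range(m):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    B = [row[:n] for row in M]
    U = [row[n:] for row in M]
    return B, U

A = [[0, 1], [1, 0], [2, 0]]
B, U = rref_with_U(A)
print(B, U, matmul(U, A) == B)
```

Every row operation acts on the appended identity as well, so the right half accumulates the product of the corresponding elementary matrices.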


Let’s  revisit  the  above  example  using  the  process  outlined  in  Theorem  2.55. 


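Example 2.56: The Form B = UA, Revisited

Let $A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{bmatrix}$. Using the process outlined in Theorem 2.55, find $U$ such that $B = UA$.

Solution. First, set up the matrix $[A|I_m]$.

\[ \left[ \begin{array}{rr|rrr} 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 2 & 0 & 0 & 0 & 1 \end{array} \right] \]

Now, row reduce this matrix until the left side equals the reduced row-echelon form of $A$.

\begin{align*}
\left[ \begin{array}{rr|rrr} 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 2 & 0 & 0 & 0 & 1 \end{array} \right]
&\rightarrow \left[ \begin{array}{rr|rrr} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 2 & 0 & 0 & 0 & 1 \end{array} \right] \\
&\rightarrow \left[ \begin{array}{rr|rrr} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & -2 & 1 \end{array} \right]
\end{align*}

The left side of this matrix is $B$, and the right side is $U$. Comparing this to the matrix $U$ found above in Example 2.54, you can see that the same matrix is obtained regardless of which process is used. ♠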

Recall from Algorithm 2.37 that an $n \times n$ matrix $A$ is invertible if and only if $A$ can be carried to the $n \times n$ identity matrix using the usual row operations. This leads to an important consequence related to the above discussion.

Suppose $A$ is an $n \times n$ invertible matrix. Then, set up the matrix $[A|I_n]$ as done above, and row reduce until it is of the form $[B|U]$. In this case, $B = I_n$ because $A$ is invertible.

\begin{align*}
B &= UA \\
I_n &= UA \\
U^{-1} &= A
\end{align*}

Now suppose that $U = E_1E_2 \cdots E_k$ where each $E_i$ is an elementary matrix representing a row operation used to carry $A$ to $I$. Then,

\[ U^{-1} = (E_1E_2 \cdots E_k)^{-1} = E_k^{-1} \cdots E_2^{-1}E_1^{-1} \]

Remember that if $E_i$ is an elementary matrix, so too is $E_i^{-1}$. It follows that

\begin{align*}
A &= U^{-1} \\
  &= E_k^{-1} \cdots E_2^{-1}E_1^{-1}
\end{align*}

and $A$ can be written as a product of elementary matrices.


Theorem  2.57:  Product  of  Elementary  Matrices 


Let  A  be  an  n  x  n  matrix.  Then  A  is  invertible  if  and  only  if  it  can  be  written  as  a  product  of 
elementary  matrices. 


Consider  the  following  example. 


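Example 2.58: Product of Elementary Matrices

Let $A = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix}$. Write $A$ as a product of elementary matrices.

Solution. We will use the process outlined in Theorem 2.55 to write $A$ as a product of elementary matrices. We will set up the matrix $[A|I]$ and row reduce, recording each row operation as an elementary matrix.

First:

\[ \left[ \begin{array}{rrr|rrr} 0 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{rrr|rrr} 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \]

represented by the elementary matrix $E_1 = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.

Secondly:

\[ \left[ \begin{array}{rrr|rrr} 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{rrr|rrr} 1 & 0 & 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \]

represented by the elementary matrix $E_2 = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.

Finally:

\[ \left[ \begin{array}{rrr|rrr} 1 & 0 & 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{rrr|rrr} 1 & 0 & 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 & 1 \end{array} \right] \]

represented by the elementary matrix $E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix}$.

Notice that the reduced row-echelon form of $A$ is $I$. Hence $I = UA$ where $U$ is the product of the above elementary matrices. It follows that $A = U^{-1}$. Since we want to write $A$ as a product of elementary matrices, we wish to express $U^{-1}$ as a product of elementary matrices.

\begin{align*}
U^{-1} &= (E_3E_2E_1)^{-1} \\
       &= E_1^{-1}E_2^{-1}E_3^{-1} \\
       &= \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix} \\
       &= A
\end{align*}

This gives $A$ written as a product of elementary matrices. By Theorem 2.57 it follows that $A$ is invertible. ♠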
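A factorization of this kind is easy to confirm by multiplying it out. The following Python sketch does so for the matrices of Example 2.58; the helper `matmul` and the variable names are our own.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[0, 1, 0], [1, 1, 0], [0, -2, 1]]

# Elementary matrices recording the three row operations, in order.
E1 = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]    # switch rows 1 and 2
E2 = [[1, -1, 0], [0, 1, 0], [0, 0, 1]]   # add -1 times row 2 to row 1
E3 = [[1, 0, 0], [0, 1, 0], [0, 2, 1]]    # add 2 times row 2 to row 3

# Their inverses, obtained by reversing each row operation.
E1_inv = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
E2_inv = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
E3_inv = [[1, 0, 0], [0, 1, 0], [0, -2, 1]]

# U = E3 E2 E1 carries A to the identity ...
U = matmul(E3, matmul(E2, E1))
reduced = matmul(U, A)

# ... and the inverses, multiplied in the opposite order, rebuild A.
product = matmul(E1_inv, matmul(E2_inv, E3_inv))
print(reduced, product == A)
```

Note that the order of the factors reverses when the inverse of the product is taken.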
2.1.10. More on Matrix Inverses


In this section, we will prove three theorems which will clarify the concept of matrix inverses. In order to do this, first recall some important properties of elementary matrices.

Recall that an elementary matrix is a square matrix obtained by performing an elementary operation on an identity matrix. Each elementary matrix is invertible, and its inverse is also an elementary matrix. If $E$ is an $m \times m$ elementary matrix and $A$ is an $m \times n$ matrix, then the product $EA$ is the result of applying to $A$ the same elementary row operation that was applied to the $m \times m$ identity matrix in order to obtain $E$.

Let $R$ be the reduced row-echelon form of an $m \times n$ matrix $A$. $R$ is obtained by iteratively applying a sequence of elementary row operations to $A$. Denote by $E_1, E_2, \cdots, E_k$ the elementary matrices associated with the elementary row operations which were applied, in order, to the matrix $A$ to obtain the resulting $R$. We then have that $R = (E_k \cdots (E_2(E_1A))) = E_k \cdots E_2E_1A$. Let $E$ denote the product matrix $E_k \cdots E_2E_1$ so that we can write $R = EA$ where $E$ is an invertible matrix whose inverse is the product $(E_1)^{-1}(E_2)^{-1} \cdots (E_k)^{-1}$.

Now,  we  will  consider  some  preliminary  lemmas. 


Lemma  2.59:  Invertible  Matrix  and  Zeros 


Suppose  that  A  and  B  are  matrices  such  that  the  product  AB  is  an  identity  matrix.  Then  the  reduced 
row-echelon  form  of  A  does  not  have  a  row  of  zeros. 


Proof. Let $R$ be the reduced row-echelon form of $A$. Then $R = EA$ for some invertible square matrix $E$ as described above. By hypothesis $AB = I$ where $I$ is an identity matrix, so we have a chain of equalities

\[ R(BE^{-1}) = (EA)(BE^{-1}) = E(AB)E^{-1} = EIE^{-1} = EE^{-1} = I \]

If $R$ would have a row of zeros, then so would the product $R(BE^{-1})$. But since the identity matrix $I$ does not have a row of zeros, neither can $R$ have one. ♠

We  now  consider  a  second  important  lemma. 


Lemma  2.60:  Size  of  Invertible  Matrix 


Suppose  that  A  and  B  are  matrices  such  that  the  product  AB  is  an  identity  matrix.  Then  A  has  at 
least  as  many  columns  as  it  has  rows. 


Proof. Let $R$ be the reduced row-echelon form of $A$. By Lemma 2.59, we know that $R$ does not have a row of zeros, and therefore each row of $R$ has a leading 1. Since each column of $R$ contains at most one of these leading 1s, $R$ must have at least as many columns as it has rows. ♠

An  important  theorem  follows  from  this  lemma. 
Proof. Suppose that $A$ and $B$ are matrices such that both products $AB$ and $BA$ are identity matrices. We will show that $A$ and $B$ must be square matrices of the same size. Let the matrix $A$ have $m$ rows and $n$ columns, so that $A$ is an $m \times n$ matrix. Since the product $AB$ exists, $B$ must have $n$ rows, and since the product $BA$ exists, $B$ must have $m$ columns so that $B$ is an $n \times m$ matrix. To finish the proof, we need only verify that $m = n$.

We first apply Lemma 2.60 with $A$ and $B$, to obtain the inequality $m \leq n$. We then apply Lemma 2.60 again (switching the order of the matrices), to obtain the inequality $n \leq m$. It follows that $m = n$, as we wanted. ♠

Of  course,  not  all  square  matrices  are  invertible.  In  particular,  zero  matrices  are  not  invertible,  along 
with  many  other  square  matrices. 

The  following  proposition  will  be  useful  in  proving  the  next  theorem. 


Proposition  2.62:  Reduced  Row-Echelon  Form  of  a  Square  Matrix 


If $R$ is the reduced row-echelon form of a square matrix, then either $R$ has a row of zeros or $R$ is an identity matrix.


The  proof  of  this  proposition  is  left  as  an  exercise  to  the  reader.  We  now  consider  the  second  important 
theorem  of  this  section. 


Theorem  2.63:  Unique  Inverse  of  a  Matrix 


Suppose $A$ and $B$ are square matrices such that $AB = I$ where $I$ is an identity matrix. Then it follows that $BA = I$. Further, both $A$ and $B$ are invertible and $B = A^{-1}$ and $A = B^{-1}$.


Proof. Let $R$ be the reduced row-echelon form of a square matrix $A$. Then, $R = EA$ where $E$ is an invertible matrix. Since $AB = I$, Lemma 2.59 gives us that $R$ does not have a row of zeros. By noting that $R$ is a square matrix and applying Proposition 2.62, we see that $R = I$. Hence, $EA = I$.

Using both that $EA = I$ and $AB = I$, we can finish the proof with a chain of equalities as given by

\begin{align*}
BA &= IBIA \\
   &= (EA)B(E^{-1}E)A \\
   &= E(AB)E^{-1}(EA) \\
   &= EIE^{-1}I \\
   &= EE^{-1} = I
\end{align*}

It follows from the definition of the inverse of a matrix that $B = A^{-1}$ and $A = B^{-1}$. ♠

This  theorem  is  very  useful,  since  with  it  we  need  only  test  one  of  the  products  AB  or  BA  in  order  to 
check  that  B  is  the  inverse  of  A.  The  hypothesis  that  A  and  B  are  square  matrices  is  very  important,  and 
without  this  the  theorem  does  not  hold. 

We  will  now  consider  an  example. 
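Solution. Consider the product $A^TA$ given by

\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

Therefore, $A^TA = I_2$, where $I_2$ is the $2 \times 2$ identity matrix. However, the product $AA^T$ is

\[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]

Hence $AA^T$ is not the $3 \times 3$ identity matrix. This shows that for Theorem 2.63, it is essential that both matrices be square and of the same size. ♠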
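The two products in this example can be reproduced in a few lines of Python. This is only an illustrative sketch; the helpers `matmul` and `transpose` are our own names.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    """Return the transpose of X (rows become columns)."""
    return [list(col) for col in zip(*X)]

A = [[1, 0], [0, 1], [0, 0]]     # the 3 x 2 matrix of the example

AtA = matmul(transpose(A), A)    # 2 x 2 product
AAt = matmul(A, transpose(A))    # 3 x 3 product

print(AtA)   # the 2 x 2 identity
print(AAt)   # not the 3 x 3 identity: the last row is zero
```

One product being an identity matrix says nothing about the other when the matrices are not square.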


Is it possible to have matrices $A$ and $B$ such that $AB = I$, while $BA = 0$? This question is left to the reader to answer, and you should take a moment to consider the answer.

We  conclude  this  section  with  an  important  theorem. 


Theorem  2.65:  The  Reduced  Row-Echelon  Form  of  an  Invertible  Matrix 


For  any  matrix  A  the  following  conditions  are  equivalent: 

•  A  is  invertible 

•  The  reduced  row-echelon  form  of  A  is  an  identity  matrix 


Proof.  In  order  to  prove  this,  we  show  that  for  any  given  matrix  A,  each  condition  implies  the  other.  We 
first  show  that  if  A  is  invertible,  then  its  reduced  row-echelon  form  is  an  identity  matrix,  then  we  show  that 
if  the  reduced  row-echelon  form  of  A  is  an  identity  matrix,  then  A  is  invertible. 

If $A$ is invertible, there is some matrix $B$ such that $AB = I$. By Lemma 2.59, we get that the reduced row-echelon form of $A$ does not have a row of zeros. Then by Theorem 2.61, it follows that $A$ and the reduced row-echelon form of $A$ are square matrices. Finally, by Proposition 2.62, this reduced row-echelon form of $A$ must be an identity matrix. This proves the first implication.

Now suppose the reduced row-echelon form of $A$ is an identity matrix $I$. Then $I = EA$ for some product $E$ of elementary matrices. By Theorem 2.63, we can conclude that $A$ is invertible. ♠

Theorem 2.65 corresponds to Algorithm 2.37, which claims that $A^{-1}$ is found by row reducing the augmented matrix $[A|I]$ to the form $[I|A^{-1}]$. This will be a matrix product $E[A|I]$ where $E$ is a product of elementary matrices. By the rules of matrix multiplication, we have that $E[A|I] = [EA|EI] = [EA|E]$.

It follows that the reduced row-echelon form of $[A|I]$ is $[EA|E]$, where $EA$ gives the reduced row-echelon form of $A$. By Theorem 2.65, if $EA \neq I$, then $A$ is not invertible, and if $EA = I$, $A$ is invertible. If $EA = I$, then by Theorem 2.63, $E = A^{-1}$. This proves that Algorithm 2.37 does in fact find $A^{-1}$.


Exercises 


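Exercise 2.1.1 For the following pairs of matrices, determine if the sum $A + B$ is defined. If so, find the sum.

(a) $A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$

(b) $A = \begin{bmatrix} 2 & 1 & 2 \\ 1 & 1 & 0 \end{bmatrix}, B = \begin{bmatrix} -1 & 0 & 3 \\ 0 & 1 & 4 \end{bmatrix}$

(c) $A = \begin{bmatrix} 1 & 0 \\ -2 & 3 \\ 4 & 2 \end{bmatrix}, B = \begin{bmatrix} 2 & 7 & -1 \\ 0 & 3 & 4 \end{bmatrix}$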

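Exercise 2.1.2 For each matrix $A$, find the matrix $-A$ such that $A + (-A) = 0$.

(a) $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$

(b) $A = \begin{bmatrix} -2 & 3 \\ 0 & 2 \end{bmatrix}$

(c) $A = \begin{bmatrix} 0 & 1 & 2 \\ 1 & -1 & 3 \\ 4 & 2 & 0 \end{bmatrix}$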

Exercise 2.1.3 In the context of Proposition 2.7, describe $-A$ and $0$.


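Exercise 2.1.4 For each matrix $A$, find the product $(-2)A$, $0A$, and $3A$.

(a) $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$

(b) $A = \begin{bmatrix} -2 & 3 \\ 0 & 2 \end{bmatrix}$

(c) $A = \begin{bmatrix} 0 & 1 & 2 \\ 1 & -1 & 3 \\ 4 & 2 & 0 \end{bmatrix}$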
Exercise 2.1.5 Using only the properties given in Proposition 2.7 and Proposition 2.10, show $-A$ is unique.

Exercise 2.1.6 Using only the properties given in Proposition 2.7 and Proposition 2.10 show $0$ is unique.

Exercise 2.1.7 Using only the properties given in Proposition 2.7 and Proposition 2.10 show $0A = 0$. Here the 0 on the left is the scalar 0 and the 0 on the right is the zero matrix of appropriate size.

Exercise 2.1.8 Using only the properties given in Proposition 2.7 and Proposition 2.10, as well as previous problems show $(-1)A = -A$.


Exercise  2.1.9  Consider  the  matrices  A 


D  = 


1  2  3 

2  1  7 


,B  = 


3-12 
-3  2  1 


,C  = 


Find  the  following  if  possible.  If  it  is  not  possible  explain  why. 


1  2 
3  1 


(a) $-3A$

(b) $3B - A$

(c) $AC$

(d) $CB$

(e) $AE$

(f) $EA$


Exercise 2.1.10 Consider the matrices $A = \begin{bmatrix} 1 & 2 \\ 3 & 2 \\ 1 & -1 \end{bmatrix}, B = \begin{bmatrix} 2 & -5 & 2 \\ -3 & 2 & 1 \end{bmatrix}, C = \begin{bmatrix} 1 & 2 \\ 5 & 0 \end{bmatrix}, D = \begin{bmatrix} -1 & 1 \\ 4 & -3 \end{bmatrix}, E = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$

Find the following if possible. If it is not possible explain why.

(a) $-3A$

(b) $3B - A$

(c) $AC$

(d) $CA$

(e) $AE$

(f) $EA$

(g) $BE$

(h) $DE$


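Exercise 2.1.11 Let $A = \begin{bmatrix} 1 & 1 \\ -2 & -1 \\ 1 & 2 \end{bmatrix}$, $B = \begin{bmatrix} 1 & -1 & -2 \\ 2 & 1 & -2 \end{bmatrix}$, and $C = \begin{bmatrix} 1 & 1 & -3 \\ -1 & 2 & 0 \\ -3 & -1 & 0 \end{bmatrix}$. Find the
following if possible.

(a) $AB$

(b) $BA$

(c) $AC$

(d) $CA$

(e) $CB$

(f) $BC$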


Exercise 2.1.12 Let $A = [\ldots]$. Find all $2 \times 2$ matrices $B$ such that $AB = 0$.

Exercise 2.1.13 Let $X = \begin{bmatrix} 1 & -1 & 1 \end{bmatrix}$ and $Y = \begin{bmatrix} 0 & 1 & 2 \end{bmatrix}$. Find $X^T Y$ and $XY^T$ if possible.


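Exercise 2.1.14 Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, $B = \begin{bmatrix} 1 & 2 \\ 3 & k \end{bmatrix}$. Is it possible to choose $k$ such that $AB = BA$? If so,
what should $k$ equal?

Exercise 2.1.15 Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, $B = \begin{bmatrix} 1 & 2 \\ 1 & k \end{bmatrix}$. Is it possible to choose $k$ such that $AB = BA$? If so,
what should $k$ equal?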


Exercise 2.1.16 Find $2 \times 2$ matrices $A$, $B$, and $C$ such that $A \neq 0$, $C \neq B$, but $AC = AB$.

Exercise 2.1.17 Give an example of matrices (of any size) $A, B, C$ such that $B \neq C$, $A \neq 0$, and yet $AB = AC$.

Exercise 2.1.18 Find $2 \times 2$ matrices $A$ and $B$ such that $A \neq 0$ and $B \neq 0$ but $AB = 0$.

Exercise 2.1.19 Give an example of matrices (of any size) $A, B$ such that $A \neq 0$ and $B \neq 0$ but $AB = 0$.

Exercise 2.1.20 Find $2 \times 2$ matrices $A$ and $B$ such that $A \neq 0$ and $B \neq 0$ with $AB \neq BA$.

Exercise 2.1.21 Write the system
\[ \begin{bmatrix} x_1 - x_2 + 2x_3 \\ 2x_3 + x_1 \\ 3x_3 \\ 3x_4 + 3x_2 + x_1 \end{bmatrix} \]
in the form $A \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$ where $A$ is an appropriate matrix.

Exercise 2.1.22 Write the system
\[ \begin{bmatrix} x_1 + 3x_2 + 2x_3 \\ 2x_3 + x_1 \\ 6x_3 \\ x_4 + 3x_2 + x_1 \end{bmatrix} \]
in the form $A \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$ where $A$ is an appropriate matrix.

Exercise 2.1.23 Write the system
\[ \begin{bmatrix} x_1 + x_2 + x_3 \\ 2x_3 + x_1 + x_2 \\ x_3 - x_1 \\ 3x_4 + x_1 \end{bmatrix} \]
in the form $A \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$ where $A$ is an appropriate matrix.


Exercise 2.1.24 A matrix $A$ is called idempotent if $A^2 = A$. Let
\[ A = \begin{bmatrix} 2 & 0 & 2 \\ 1 & 1 & 2 \\ -1 & 0 & -1 \end{bmatrix} \]
and show that $A$ is idempotent.


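Exercise 2.1.25 For each pair of matrices, find the $(1,2)$-entry and $(2,3)$-entry of the product $AB$.

(a) $A = \begin{bmatrix} 1 & 2 & -1 \\ 3 & 4 & 0 \\ 2 & 5 & 1 \end{bmatrix}$, $B = \begin{bmatrix} \cdots & 6 & -2 \\ 7 & 2 & 1 \\ -1 & 0 & 0 \end{bmatrix}$

(b) $A = \begin{bmatrix} 1 & 3 & 1 \\ 0 & 2 & 4 \\ 1 & 0 & 5 \end{bmatrix}$, $B = \begin{bmatrix} 2 & 3 & 0 \\ -4 & 16 & 1 \\ 0 & 2 & 2 \end{bmatrix}$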

Exercise 2.1.26 Suppose $A$ and $B$ are square matrices of the same size. Which of the following are
necessarily true?

(a) $(A - B)^2 = A^2 - 2AB + B^2$

(b) $(AB)^2 = A^2 B^2$

(c) $(A + B)^2 = A^2 + 2AB + B^2$

(d) $(A + B)^2 = A^2 + AB + BA + B^2$

(e) $A^2 B^2 = A(AB)B$

(f) $(A + B)^3 = A^3 + 3A^2 B + 3AB^2 + B^3$

(g) $(A + B)(A - B) = A^2 - B^2$

Exercise 2.1.27 Consider the matrices $A = \begin{bmatrix} 1 & 2 \\ 3 & 2 \\ 1 & -1 \end{bmatrix}$, $B = \begin{bmatrix} 2 & -5 & 2 \\ -3 & 2 & 1 \end{bmatrix}$, $C = \begin{bmatrix} 1 & 2 \\ 5 & 0 \end{bmatrix}$, $D = \begin{bmatrix} -1 & 1 \\ 4 & -3 \end{bmatrix}$, $E = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$.

Find the following if possible. If it is not possible explain why.

(a) $-3A^T$

(b) $3B - A^T$

(c) $E^T B$

(d) $EE^T$

(e) $B^T B$

(f) $CA^T$

(g) $D^T BE$


Exercise 2.1.28 Let $A$ be an $n \times n$ matrix. Show $A$ equals the sum of a symmetric and a skew symmetric
matrix. Hint: Show that $\frac{1}{2}(A^T + A)$ is symmetric and then consider using this as one of the matrices.

Exercise 2.1.29 Show that the main diagonal of every skew symmetric matrix consists of only zeros.
Recall that the main diagonal consists of every entry of the matrix which is of the form $a_{ii}$.

Exercise 2.1.30 Prove 3. That is, show that for an $m \times n$ matrix $A$, an $m \times n$ matrix $B$, and scalars $r, s$, the
following holds:
\[ (rA + sB)^T = rA^T + sB^T \]

Exercise 2.1.31 Prove that $I_m A = A$ where $A$ is an $m \times n$ matrix.

Exercise 2.1.32 Suppose $AB = AC$ and $A$ is an invertible $n \times n$ matrix. Does it follow that $B = C$? Explain
why or why not.

Exercise 2.1.33 Suppose $AB = AC$ and $A$ is a non invertible $n \times n$ matrix. Does it follow that $B = C$?
Explain why or why not.

Exercise 2.1.34 Give an example of a matrix $A$ such that $A^2 = I$ and yet $A \neq I$ and $A \neq -I$.


Exercise 2.1.35 Let $A = \begin{bmatrix} 2 & 1 \\ -1 & 3 \end{bmatrix}$. Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.

Exercise 2.1.36 Let $A = \begin{bmatrix} 0 & 1 \\ 5 & 3 \end{bmatrix}$. Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.

Exercise 2.1.37 Let $A = \begin{bmatrix} 2 & 1 \\ 3 & 0 \end{bmatrix}$. Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.

Exercise 2.1.38 Let $A = \begin{bmatrix} 2 & 1 \\ 4 & 2 \end{bmatrix}$. Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.

Exercise 2.1.39 Let $A$ be a $2 \times 2$ invertible matrix, with $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Find a formula for $A^{-1}$ in terms of
$a, b, c, d$.

Exercise 2.1.40 Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 4 \\ 1 & 0 & 2 \end{bmatrix}$. Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.

Exercise 2.1.41 Let $A = \begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix}$. Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.

Exercise 2.1.42 Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 4 \\ 4 & 5 & 10 \end{bmatrix}$. Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.


Exercise 2.1.43 Let $A = \begin{bmatrix} 1 & 2 & 0 & 2 \\ 1 & 1 & 2 & 0 \\ 2 & 1 & -3 & 2 \\ 1 & 2 & 1 & 2 \end{bmatrix}$. Find $A^{-1}$ if possible. If $A^{-1}$ does not exist, explain why.


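Exercise 2.1.44 Using the inverse of the matrix, find the solution to the systems:

(a) $\begin{bmatrix} 2 & 4 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$

(b) $\begin{bmatrix} 2 & 4 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix}$

Now give the solution in terms of $a$ and $b$ to
\[ \begin{bmatrix} 2 & 4 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} \]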

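Exercise 2.1.45 Using the inverse of the matrix, find the solution to the systems:

(a) $\begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$

(b) $\begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3 \\ -1 \\ -2 \end{bmatrix}$

Now give the solution in terms of $a$, $b$, and $c$ to the following:
\[ \begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} a \\ b \\ c \end{bmatrix} \]

Exercise 2.1.46 Show that if $A$ is an $n \times n$ invertible matrix and $X$ is an $n \times 1$ matrix such that $AX = B$ for
$B$ an $n \times 1$ matrix, then $X = A^{-1}B$.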

Exercise 2.1.47 Prove that if $A^{-1}$ exists and $AX = 0$ then $X = 0$.

Exercise 2.1.48 Show that if $A^{-1}$ exists for an $n \times n$ matrix, then it is unique. That is, if $BA = I$ and $AB = I$,
then $B = A^{-1}$.

Exercise 2.1.49 Show that if $A$ is an invertible $n \times n$ matrix, then so is $A^T$ and $(A^T)^{-1} = (A^{-1})^T$.

Exercise 2.1.50 Show $(AB)^{-1} = B^{-1}A^{-1}$ by verifying that
\[ AB(B^{-1}A^{-1}) = I \]
and
\[ B^{-1}A^{-1}(AB) = I \]
Hint: Use Problem 2.1.48.

Exercise 2.1.51 Show that $(ABC)^{-1} = C^{-1}B^{-1}A^{-1}$ by verifying that
\[ (ABC)(C^{-1}B^{-1}A^{-1}) = I \]
and
\[ (C^{-1}B^{-1}A^{-1})(ABC) = I \]
Hint: Use Problem 2.1.48.

Exercise 2.1.52 If $A$ is invertible, show $(A^2)^{-1} = (A^{-1})^2$. Hint: Use Problem 2.1.48.

Exercise 2.1.53 If $A$ is invertible, show $(A^{-1})^{-1} = A$. Hint: Use Problem 2.1.48.


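Exercise 2.1.54 Let $A = \begin{bmatrix} 2 & 3 \\ 1 & 2 \end{bmatrix}$. Suppose a row operation is applied to $A$ and the result is $B = \begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix}$.
Find the elementary matrix $E$ that represents this row operation.

Exercise 2.1.55 Let $A = \begin{bmatrix} 4 & 0 \\ 2 & 1 \end{bmatrix}$. Suppose a row operation is applied to $A$ and the result is $B = \begin{bmatrix} 8 & 0 \\ 2 & 1 \end{bmatrix}$.
Find the elementary matrix $E$ that represents this row operation.

Exercise 2.1.56 Let $A = \begin{bmatrix} 1 & -3 \\ 0 & 5 \end{bmatrix}$. Suppose a row operation is applied to $A$ and the result is $B = \begin{bmatrix} 1 & -3 \\ 2 & -1 \end{bmatrix}$.
Find the elementary matrix $E$ that represents this row operation.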


Exercise 2.1.57 Let $A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 5 & 1 \\ 2 & -1 & 4 \end{bmatrix}$. Suppose a row operation is applied to $A$ and the result is
$B = \begin{bmatrix} 1 & 2 & 1 \\ 2 & -1 & 4 \\ 0 & 5 & 1 \end{bmatrix}$.

(a) Find the elementary matrix $E$ such that $EA = B$.

(b) Find the inverse of $E$, $E^{-1}$, such that $E^{-1}B = A$.


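Exercise 2.1.58 Let $A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 5 & 1 \\ 2 & -1 & 4 \end{bmatrix}$. Suppose a row operation is applied to $A$ and the result is
$B = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 10 & 2 \\ 2 & -1 & 4 \end{bmatrix}$.

(a) Find the elementary matrix $E$ such that $EA = B$.

(b) Find the inverse of $E$, $E^{-1}$, such that $E^{-1}B = A$.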


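Exercise 2.1.59 Let $A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 5 & 1 \\ 2 & -1 & 4 \end{bmatrix}$. Suppose a row operation is applied to $A$ and the result is
$B = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 5 & 1 \\ \cdots & \cdots & \cdots \end{bmatrix}$.

(a) Find the elementary matrix $E$ such that $EA = B$.

(b) Find the inverse of $E$, $E^{-1}$, such that $E^{-1}B = A$.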


Exercise 2.1.60 Let $A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 5 & 1 \\ 2 & -1 & 4 \end{bmatrix}$. Suppose a row operation is applied to $A$ and the result is
$B = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 5 \\ 2 & -1 & 4 \end{bmatrix}$.

(a) Find the elementary matrix $E$ such that $EA = B$.

(b) Find the inverse of $E$, $E^{-1}$, such that $E^{-1}B = A$.

2.2 LU Factorization


An  LU  factorization  of  a  matrix  involves  writing  the  given  matrix  as  the  product  of  a  lower  triangular 
matrix  L  which  has  the  main  diagonal  consisting  entirely  of  ones,  and  an  upper  triangular  matrix  U  in  the 
indicated  order.  This  is  the  version  discussed  here  but  it  is  sometimes  the  case  that  the  L  has  numbers 
other  than  1  down  the  main  diagonal.  It  is  still  a  useful  concept.  The  L  goes  with  “lower”  and  the  U  with 
“upper”. 

It turns out many matrices can be written in this way and when this is possible, people get excited
about slick ways of solving the system of equations, $AX = B$. It is for this reason that you want to study
the LU factorization. It allows you to work only with triangular matrices. It turns out that it takes about
half as many operations to obtain an LU factorization as it does to find the row reduced echelon form.

First it should be noted not all matrices have an LU factorization and so we will emphasize the techniques
for achieving it rather than formal proofs.


Example 2.66: A Matrix with NO LU factorization

Can you write $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ in the form $LU$ as just described?

Solution. To do so you would need
\[ \begin{bmatrix} 1 & 0 \\ x & 1 \end{bmatrix} \begin{bmatrix} a & b \\ 0 & c \end{bmatrix} = \begin{bmatrix} a & b \\ xa & xb + c \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \]

Therefore, $b = 1$ and $a = 0$. Also, from the bottom rows, $xa = 1$, which cannot happen when $a = 0$.
Therefore, you can't write this matrix in the form $LU$. It has no LU factorization. This is what we mean
above by saying the method lacks generality.

Nevertheless the method is often extremely useful, and we will describe below one of the many methods
used to produce an LU factorization when possible. ♠

2.2.1.  Finding  An  LU  Factorization  By  Inspection 


Which  matrices  have  an  LU  factorization?  It  turns  out  it  is  those  whose  row-echelon  form  can  be  achieved 
without  switching  rows.  In  other  words  matrices  which  only  involve  using  row  operations  of  type  2  or  3 
to  obtain  the  row-echelon  form. 


Example 2.67: An LU factorization

Find an LU factorization of $A = \begin{bmatrix} 1 & 2 & 0 & 2 \\ 1 & 3 & 2 & 1 \\ 2 & 3 & 4 & 0 \end{bmatrix}$.

One way to find the LU factorization is to simply look for it directly. You need
\[ \begin{bmatrix} 1 & 2 & 0 & 2 \\ 1 & 3 & 2 & 1 \\ 2 & 3 & 4 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ x & 1 & 0 \\ y & z & 1 \end{bmatrix} \begin{bmatrix} a & d & h & j \\ 0 & b & e & i \\ 0 & 0 & c & f \end{bmatrix} \]

Then multiplying these you get
\[ \begin{bmatrix} a & d & h & j \\ xa & xd + b & xh + e & xj + i \\ ya & yd + zb & yh + ze + c & yj + iz + f \end{bmatrix} \]

and so you can now tell what the various quantities equal. From the first column, you need $a = 1$, $x = 1$, $y = 2$.
Now go to the second column. You need $d = 2$, $xd + b = 3$ so $b = 1$, $yd + zb = 3$ so $z = -1$. From
the third column, $h = 0$, $e = 2$, $c = 6$. Now from the fourth column, $j = 2$, $i = -1$, $f = -5$. Therefore, an
LU factorization is
\[ \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 2 & -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 0 & 2 \\ 0 & 1 & 2 & -1 \\ 0 & 0 & 6 & -5 \end{bmatrix} \]

You  can  check  whether  you  got  it  right  by  simply  multiplying  these  two. 
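
The check suggested above can be carried out in a few lines. This is only a sketch: the helper `matmul` is an illustrative name, not something defined in the text, and the matrices are the factors found in Example 2.67.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[1, 2, 0, 2], [1, 3, 2, 1], [2, 3, 4, 0]]
L = [[1, 0, 0], [1, 1, 0], [2, -1, 1]]
U = [[1, 2, 0, 2], [0, 1, 2, -1], [0, 0, 6, -5]]

# The factorization is correct exactly when the product LU reproduces A.
print(matmul(L, U) == A)
```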

2.2.2.  LU  Factorization,  Multiplier  Method 


Remember  that  for  a  matrix  A  to  be  written  in  the  form  A  =  LU,  you  must  be  able  to  reduce  it  to  its 
row-echelon  form  without  interchanging  rows.  The  following  method  gives  a  process  for  calculating  the 
LU  factorization  of  such  a  matrix  A. 


Example 2.68: LU factorization

Find an LU factorization for $\begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ -2 & 3 & -2 \end{bmatrix}$.

Solution.

Write the matrix as the following product.
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ -2 & 3 & -2 \end{bmatrix} \]

In the matrix on the right, begin with the left column and zero out the entries below the top using the row
operation which involves adding a multiple of a row to another row. You do this and also update the matrix
on the left so that the product will be unchanged. Here is the first step. Take $-2$ times the top row and add
to the second. Then take $2$ times the top row and add to the second in the matrix on the left.
\[ \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 0 & -1 & -5 \\ -2 & 3 & -2 \end{bmatrix} \]

The next step is to take $2$ times the top row and add to the bottom in the matrix on the right. To ensure that
the product is unchanged, you place a $-2$ in the bottom left in the matrix on the left. Thus the next step
yields
\[ \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 0 & -1 & -5 \\ 0 & 7 & 4 \end{bmatrix} \]

Next take $7$ times the middle row on the right and add to the bottom row. Update the matrix on the left in a
similar manner to what was done earlier, this time placing a $-7$ in the bottom row, second column.
\[ \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -2 & -7 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 0 & -1 & -5 \\ 0 & 0 & -31 \end{bmatrix} \]

At this point, stop. You are done. ♠

The  method  just  described  is  called  the  multiplier  method. 
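
The steps of the multiplier method can be sketched in code. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `lu_multiplier` is hypothetical, exact fractions are used to avoid rounding, and the routine simply gives up (raises an error) when a row interchange would be needed. Note that the variable `mult` below is the quantity subtracted, so it is $-1$ times the multiplier in the sense of the text, and it is exactly the number placed in $L$.

```python
from fractions import Fraction

def lu_multiplier(A):
    """Return (L, U) with A = LU and ones on the diagonal of L, using no row swaps.

    Raises ValueError when a zero pivot would force a row interchange.
    """
    m, n = len(A), len(A[0])
    U = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]
    row = 0  # index of the row currently supplying the pivot
    for col in range(n):
        if row >= m:
            break
        if U[row][col] == 0:
            if any(U[r][col] != 0 for r in range(row + 1, m)):
                raise ValueError("a row interchange would be required")
            continue  # nothing to clear in this column; move right
        for r in range(row + 1, m):
            mult = U[r][col] / U[row][col]
            # Subtract mult times the pivot row, and record mult in L
            # so that the product LU stays equal to A.
            U[r] = [x - mult * y for x, y in zip(U[r], U[row])]
            L[r][row] = mult
        row += 1
    return L, U

L, U = lu_multiplier([[1, 2, 3], [2, 3, 1], [-2, 3, -2]])
print(L)
print(U)
```

Applied to the matrix of Example 2.68 this reproduces the two factors found above, and applied to the matrix of Example 2.66 it raises the error, since that matrix cannot be reduced without switching rows.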

2.2.3.  Solving  Systems  using  LU  Factorization 


One  reason  people  care  about  the  LU  factorization  is  it  allows  the  quick  solution  of  systems  of  equations. 
Here  is  an  example. 


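Example 2.69: LU factorization to Solve Equations

Suppose you want to find the solutions to
\[ \begin{bmatrix} 1 & 2 & 3 & 2 \\ 4 & 3 & 1 & 1 \\ 1 & 2 & 3 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \]

Solution.

Of course one way is to write the augmented matrix and grind away. However, this involves more row
operations than the computation of the LU factorization and it turns out that the LU factorization can give
the solution quickly. Here is how. The following is an LU factorization for the matrix.
\[ \begin{bmatrix} 1 & 2 & 3 & 2 \\ 4 & 3 & 1 & 1 \\ 1 & 2 & 3 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 & 2 \\ 0 & -5 & -11 & -7 \\ 0 & 0 & 0 & -2 \end{bmatrix} \]

Let $UX = Y$ and consider $LY = B$ where in this case, $B = [1, 2, 3]^T$. Thus
\[ \begin{bmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \]
which yields very quickly that $Y = \begin{bmatrix} 1 \\ -2 \\ 2 \end{bmatrix}$.

Now you can find $X$ by solving $UX = Y$. Thus in this case,
\[ \begin{bmatrix} 1 & 2 & 3 & 2 \\ 0 & -5 & -11 & -7 \\ 0 & 0 & 0 & -2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \\ 2 \end{bmatrix} \]
which yields
\[ X = \begin{bmatrix} -\frac{3}{5} + \frac{7}{5}t \\ \frac{9}{5} - \frac{11}{5}t \\ t \\ -1 \end{bmatrix}, \quad t \in \mathbb{R}. \]
♠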
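
The two triangular solves in this example can be checked with a short sketch. The names `forward_substitute`, `solution`, and `matvec` are illustrative only. The first function solves $LY = B$ from the top row down; the second records the one-parameter family obtained by back substitution in $UX = Y$, with $z = t$ free; the final loop confirms that this family satisfies both $UX = Y$ and the original system for a few values of $t$.

```python
from fractions import Fraction

A = [[1, 2, 3, 2], [4, 3, 1, 1], [1, 2, 3, 0]]
L = [[1, 0, 0], [4, 1, 0], [1, 0, 1]]
U = [[1, 2, 3, 2], [0, -5, -11, -7], [0, 0, 0, -2]]
B = [1, 2, 3]

def forward_substitute(L, B):
    """Solve LY = B when L is lower triangular with ones on the diagonal."""
    Y = []
    for i, b in enumerate(B):
        # Each y_i is b_i minus the contributions of the y_k already found.
        Y.append(Fraction(b) - sum(L[i][k] * Y[k] for k in range(i)))
    return Y

def solution(t):
    """The family read off from UX = Y: w = -1, z = t, then y and x."""
    t = Fraction(t)
    return [Fraction(-3, 5) + Fraction(7, 5) * t,
            Fraction(9, 5) - Fraction(11, 5) * t,
            t,
            Fraction(-1)]

def matvec(M, X):
    """Product of a matrix (list of rows) with a column stored as a list."""
    return [sum(a * x for a, x in zip(row, X)) for row in M]

Y = forward_substitute(L, B)
print(Y)
for t in (0, 1, -2):
    print(matvec(U, solution(t)) == Y, matvec(A, solution(t)) == B)
```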


2.2.4.  Justification  for  the  Multiplier  Method 


Why  does  the  multiplier  method  work  for  finding  the  LU  factorization?  Suppose  A  is  a  matrix  which  has 
the  property  that  the  row-echelon  form  for  A  may  be  achieved  without  switching  rows.  Thus  every  row 
which  is  replaced  using  this  row  operation  in  obtaining  the  row-echelon  form  may  be  modified  by  using 
a  row  which  is  above  it. 


Lemma 2.70: Multiplier Method and Triangular Matrices

Let $L$ be a lower (upper) triangular $m \times m$ matrix which has ones down the main diagonal. Then
$L^{-1}$ also is a lower (upper) triangular matrix which has ones down the main diagonal. In the case
that $L$ is of the form
\[ L = \begin{bmatrix} 1 & & & \\ a_1 & 1 & & \\ \vdots & & \ddots & \\ a_n & & & 1 \end{bmatrix} \tag{2.11} \]
where all entries are zero except for the left column and main diagonal, it is also the case that $L^{-1}$
is obtained from $L$ by simply multiplying each entry below the main diagonal in $L$ with $-1$. The
same is true if the single nonzero column is in another position.


Proof. Consider the usual setup for finding the inverse $[\, L \mid I \,]$. Then each row operation done to $L$ to
reduce to row reduced echelon form results in changing only the entries in $I$ below the main diagonal. In
the special case of $L$ given in 2.11 or the single nonzero column is in another position, multiplication by
$-1$ as described in the lemma clearly results in $L^{-1}$. ♠

For a simple illustration of the last claim,
\[ \left[ \begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & a & 1 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & -a & 1 \end{array} \right] \]
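
The last claim of the lemma is easy to test on a particular matrix. The sketch below uses illustrative names (`matmul`, `negate_below_diagonal`, `identity`) and sample entries that are not from the text. It also shows why the hypothesis of a single nonzero column matters: for a matrix with two nonzero columns below the diagonal, changing signs does not give the inverse.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def negate_below_diagonal(L):
    """Multiply every entry strictly below the main diagonal by -1."""
    return [[-x if j < i else x for j, x in enumerate(row)]
            for i, row in enumerate(L)]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

# Nonzero entries only in the first column below the diagonal, as in (2.11).
L1 = [[1, 0, 0, 0], [2, 1, 0, 0], [-3, 0, 1, 0], [5, 0, 0, 1]]
# The single nonzero column in another position.
L2 = [[1, 0, 0], [0, 1, 0], [0, 7, 1]]
# Two nonzero columns: the sign change alone is no longer the inverse.
L3 = [[1, 0, 0], [2, 1, 0], [-2, -7, 1]]

print(matmul(L1, negate_below_diagonal(L1)) == identity(4))
print(matmul(L2, negate_below_diagonal(L2)) == identity(3))
print(matmul(L3, negate_below_diagonal(L3)) == identity(3))
```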

Now let $A$ be an $m \times n$ matrix, say
\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \]
and assume $A$ can be row reduced to an upper triangular form using only row operation 3. Thus, in
particular, $a_{11} \neq 0$. Multiply on the left by
\[ E_1 = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ -\frac{a_{21}}{a_{11}} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -\frac{a_{m1}}{a_{11}} & 0 & \cdots & 1 \end{bmatrix} \]
This is the product of elementary matrices which make modifications in the first column only. It is equivalent
to taking $-a_{21}/a_{11}$ times the first row and adding to the second. Then taking $-a_{31}/a_{11}$ times the
first row and adding to the third and so forth. The quotients in the first column of the above matrix are the
multipliers. Thus the result is of the form
\[ E_1 A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a'_{22} & \cdots & a'_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & a'_{m2} & \cdots & a'_{mn} \end{bmatrix} \]
By assumption, $a'_{22} \neq 0$ and so it is possible to use this entry to zero out all the entries below it in the matrix
on the right by multiplication by a matrix of the form $E_2 = \begin{bmatrix} 1 & 0 \\ 0 & E \end{bmatrix}$ where $E$ is an $[m-1] \times [m-1]$ matrix
of the form
\[ E = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ -\frac{a'_{32}}{a'_{22}} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -\frac{a'_{m2}}{a'_{22}} & 0 & \cdots & 1 \end{bmatrix} \]
Again, the entries in the first column below the 1 are the multipliers. Continuing this way, zeroing out the
entries below the diagonal entries, finally leads to
\[ E_{m-1} E_{m-2} \cdots E_1 A = U \]
where $U$ is upper triangular. Each $E_j$ has all ones down the main diagonal and is lower triangular. Now
multiply both sides by the inverses of the $E_j$ in the reverse order. This yields
\[ A = E_1^{-1} E_2^{-1} \cdots E_{m-1}^{-1} U \]
By Lemma 2.70, this implies that the product of those $E_j^{-1}$ is a lower triangular matrix having all ones
down the main diagonal.

The above discussion and lemma gives the justification for the multiplier method. The expressions
\[ \frac{-a_{21}}{a_{11}}, \frac{-a_{31}}{a_{11}}, \cdots, \frac{-a_{m1}}{a_{11}} \]
denoted respectively by $M_{21}, \cdots, M_{m1}$ to save notation which were obtained in building $E_1$ are the multipliers.
Then according to the lemma, to find $E_1^{-1}$ you simply write
\[ \begin{bmatrix} 1 & 0 & \cdots & 0 \\ -M_{21} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -M_{m1} & 0 & \cdots & 1 \end{bmatrix} \]
Similar considerations apply to the other $E_j^{-1}$. Thus $L$ is a product of the form
\[ \begin{bmatrix} 1 & 0 & \cdots & 0 \\ -M_{21} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -M_{m1} & 0 & \cdots & 1 \end{bmatrix} \cdots \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & -M_{m[m-1]} & 1 \end{bmatrix} \]
each factor having at most one nonzero column, the position of which moves from left to right in scanning
the above product of matrices from left to right. It follows from what we know about the effect of
multiplying on the left by an elementary matrix that the above product is of the form
\[ \begin{bmatrix} 1 & 0 & \cdots & 0 & 0 \\ -M_{21} & 1 & \cdots & 0 & 0 \\ \vdots & -M_{32} & \ddots & \vdots & \vdots \\ -M_{[m-1]1} & \vdots & \cdots & 1 & 0 \\ -M_{m1} & -M_{m2} & \cdots & -M_{m[m-1]} & 1 \end{bmatrix} \]
In words, beginning at the left column and moving toward the right, you simply insert, into the corresponding
position in the identity matrix, $-1$ times the multiplier which was used to zero out an entry in
that position below the main diagonal in $A$, while retaining the main diagonal which consists entirely of
ones. This is $L$.


Exercises 


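Exercise 2.2.1 Find an LU factorization of $\begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 3 \\ 1 & 2 & 3 \end{bmatrix}$.

Exercise 2.2.2 Find an LU factorization of $\begin{bmatrix} 1 & 2 & 3 & 2 \\ 1 & 3 & 2 & 1 \\ 5 & 0 & 1 & 3 \end{bmatrix}$.

Exercise 2.2.3 Find an LU factorization of the matrix $\begin{bmatrix} 1 & -2 & -5 & 0 \\ -2 & 5 & 11 & 3 \\ 3 & -6 & -15 & 1 \end{bmatrix}$.

Exercise 2.2.4 Find an LU factorization of the matrix $\begin{bmatrix} 1 & -1 & -3 & -1 \\ -1 & 2 & 4 & 3 \\ 2 & -3 & -7 & -3 \end{bmatrix}$.

Exercise 2.2.5 Find an LU factorization of the matrix $\begin{bmatrix} 1 & -3 & -4 & -3 \\ 3 & 10 & 10 & 10 \\ 1 & -6 & 2 & -5 \end{bmatrix}$.

Exercise 2.2.6 Find an LU factorization of the matrix $\begin{bmatrix} 1 & 3 & 1 & -1 \\ 3 & 10 & 8 & -1 \\ 2 & 5 & -3 & -3 \end{bmatrix}$.

Exercise 2.2.7 Find an LU factorization of the matrix $\begin{bmatrix} 3 & -2 & 1 \\ 9 & -8 & 6 \\ -6 & 2 & 2 \\ 3 & 2 & -7 \end{bmatrix}$.

Exercise 2.2.8 Find an LU factorization of the matrix $\begin{bmatrix} -3 & -1 & 3 \\ 9 & 9 & -12 \\ 3 & 19 & -16 \\ 12 & 40 & -26 \end{bmatrix}$.

Exercise 2.2.9 Find an LU factorization of the matrix $\begin{bmatrix} -1 & -3 & -1 \\ 1 & 3 & 0 \\ 3 & 9 & 0 \\ 4 & 12 & 16 \end{bmatrix}$.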

Exercise 2.2.10 Find the LU factorization of the coefficient matrix using Doolittle's method and use it to
solve the system of equations.

x + 2y = 5
2x + 3y = 6

Exercise 2.2.11 Find the LU factorization of the coefficient matrix using Doolittle's method and use it to
solve the system of equations.

x + 2y + z = 1
y + 3z = 2
2x + 3y = 6

Exercise 2.2.12 Find the LU factorization of the coefficient matrix using Doolittle's method and use it to
solve the system of equations.

x + 2y + 3z = 5
2x + 3y + z = 6
x - y + z = 2

Exercise 2.2.13 Find the LU factorization of the coefficient matrix using Doolittle's method and use it to
solve the system of equations.

x + 2y + 3z = 5
2x + 3y + z = 6
3x + 5y + 4z = 11


Exercise 2.2.14 Is there only one LU factorization for a given matrix? Hint: Consider the equation
\[ \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \]
Look for all possible LU factorizations.


3.  Determinants 


3.1  Basic  Techniques  and  Properties 


Outcomes 


A.  Evaluate  the  determinant  of  a  square  matrix  using  either  Laplace  Expansion  or  row  operations. 

B.  Demonstrate  the  effects  that  row  operations  have  on  determinants. 

C.  Verify  the  following: 

(a)  The  determinant  of  a  product  of  matrices  is  the  product  of  the  determinants. 

(b)  The  determinant  of  a  matrix  is  equal  to  the  determinant  of  its  transpose. 


3.1.1.  Cofactors  and  2x2  Determinants 


Let  A  be  an  n  x  n  matrix.  That  is,  let  A  be  a  square  matrix.  The  determinant  of  A,  denoted  by  det  (A)  is  a 
very  important  number  which  we  will  explore  throughout  this  section. 

If  A  is  a  2x2  matrix,  the  determinant  is  given  by  the  following  formula. 


Definition 3.1: Determinant of a Two By Two Matrix

Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Then
\[ \det(A) = ad - cb \]
The determinant is also often denoted by enclosing the matrix with two vertical lines. Thus
\[ \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} \]

The following is an example of finding the determinant of a $2 \times 2$ matrix.

Example 3.2: A Two by Two Determinant

Find $\det(A)$ for the matrix $A = \begin{bmatrix} 2 & 4 \\ -1 & 6 \end{bmatrix}$.

Solution. From Definition 3.1,
\[ \det(A) = (2)(6) - (-1)(4) = 12 + 4 = 16 \]
♠

The  2  x  2  determinant  can  be  used  to  find  the  determinant  of  larger  matrices.  We  will  now  explore  how 
to  find  the  determinant  of  a  3  x  3  matrix,  using  several  tools  including  the  2x2  determinant. 

We  begin  with  the  following  definition. 


Definition 3.3: The ijth Minor of a Matrix

Let $A$ be a $3 \times 3$ matrix. The $ij$th minor of $A$, denoted as $\operatorname{minor}(A)_{ij}$, is the determinant of the $2 \times 2$
matrix which results from deleting the $i$th row and the $j$th column of $A$.

In general, if $A$ is an $n \times n$ matrix, then the $ij$th minor of $A$ is the determinant of the $(n-1) \times (n-1)$
matrix which results from deleting the $i$th row and the $j$th column of $A$.


Hence,  there  is  a  minor  associated  with  each  entry  of  A.  Consider  the  following  example  which 
demonstrates  this  definition. 


Solution. First we will find $\operatorname{minor}(A)_{12}$. By Definition 3.3, this is the determinant of the $2 \times 2$ matrix
which results when you delete the first row and the second column. This minor is given by
\[ \operatorname{minor}(A)_{12} = \det \begin{bmatrix} 4 & 2 \\ 3 & 1 \end{bmatrix} \]
Using Definition 3.1, we see that
\[ \det \begin{bmatrix} 4 & 2 \\ 3 & 1 \end{bmatrix} = (4)(1) - (3)(2) = 4 - 6 = -2 \]
Therefore $\operatorname{minor}(A)_{12} = -2$.

Similarly, $\operatorname{minor}(A)_{23}$ is the determinant of the $2 \times 2$ matrix which results when you delete the second
row and the third column. This minor is therefore
\[ \operatorname{minor}(A)_{23} = \det \begin{bmatrix} 1 & 2 \\ 3 & 2 \end{bmatrix} = -4 \]
Finding the other minors of $A$ is left as an exercise. ♠

The $ij$th minor of a matrix $A$ is used in another important definition, given next.


It is also convenient to refer to the cofactor of an entry of a matrix as follows. If $a_{ij}$ is the $ij$th entry of
the matrix, then its cofactor is just $\operatorname{cof}(A)_{ij}$.


Solution. We will use Definition 3.5 to compute these cofactors.

First, we will compute $\operatorname{cof}(A)_{12}$. Therefore, we need to find $\operatorname{minor}(A)_{12}$. This is the determinant of
the $2 \times 2$ matrix which results when you delete the first row and the second column. Thus $\operatorname{minor}(A)_{12}$ is
given by
\[ \det \begin{bmatrix} 4 & 2 \\ 3 & 1 \end{bmatrix} = -2 \]
Then,
\[ \operatorname{cof}(A)_{12} = (-1)^{1+2} \operatorname{minor}(A)_{12} = (-1)^{1+2}(-2) = 2 \]
Hence, $\operatorname{cof}(A)_{12} = 2$.

Similarly, we can find $\operatorname{cof}(A)_{23}$. First, find $\operatorname{minor}(A)_{23}$, which is the determinant of the $2 \times 2$ matrix
which results when you delete the second row and the third column. This minor is therefore
\[ \det \begin{bmatrix} 1 & 2 \\ 3 & 2 \end{bmatrix} = -4 \]
Hence,
\[ \operatorname{cof}(A)_{23} = (-1)^{2+3} \operatorname{minor}(A)_{23} = (-1)^{2+3}(-4) = 4 \]
♠

You  may  wish  to  find  the  remaining  cofactors  for  the  above  matrix.  Remember  that  there  is  a  cofactor 
for  every  entry  in  the  matrix. 
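
The minors and cofactors computed above can be reproduced with a short sketch. The matrix `A` below is an assumption: it is reconstructed from the $2 \times 2$ blocks that appear in the solutions and in the cofactor expansion later in this section, and the function names are illustrative. Rows and columns are counted from 1, as in the text.

```python
def det2(M):
    """Determinant of a 2x2 matrix [[a, b], [c, d]], that is ad - cb."""
    (a, b), (c, d) = M
    return a * d - c * b

def minor(A, i, j):
    """ij-th minor of a 3x3 matrix: delete row i and column j, then take det."""
    sub = [[x for q, x in enumerate(row, start=1) if q != j]
           for p, row in enumerate(A, start=1) if p != i]
    return det2(sub)

def cofactor(A, i, j):
    """ij-th cofactor: the minor with the sign (-1)^(i+j) attached."""
    return (-1) ** (i + j) * minor(A, i, j)

# Assumed matrix, reconstructed from the worked solutions.
A = [[1, 2, 3], [4, 3, 2], [3, 2, 1]]

print(minor(A, 1, 2), minor(A, 2, 3))
print(cofactor(A, 1, 2), cofactor(A, 2, 3))
```

Looping `i` and `j` over 1, 2, 3 gives the remaining seven minors and cofactors.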

We  have  now  established  the  tools  we  need  to  find  the  determinant  of  a  3  x  3  matrix. 


Definition  3.7:  The  Determinant  of  a  Three  By  Three  Matrix 


Let $A$ be a $3 \times 3$ matrix. Then, $\det(A)$ is calculated by picking a row (or column) and taking the
product of each entry in that row (column) with its cofactor and adding these products together.
This process when applied to the $i$th row (column) is known as expanding along the $i$th row (column)
and is given by
\[ \det(A) = a_{i1}\operatorname{cof}(A)_{i1} + a_{i2}\operatorname{cof}(A)_{i2} + a_{i3}\operatorname{cof}(A)_{i3} \]


When  calculating  the  determinant,  you  can  choose  to  expand  any  row  or  any  column.  Regardless  of 
your  choice,  you  will  always  get  the  same  number  which  is  the  determinant  of  the  matrix  A.  This  method  of 
evaluating  a  determinant  by  expanding  along  a  row  or  a  column  is  called  Laplace  Expansion  or  Cofactor 
Expansion. 

Consider  the  following  example. 


Solution. First, we will calculate $\det(A)$ by expanding along the first column. Using Definition 3.7, we
take the 1 in the first column and multiply it by its cofactor,
\[ 1(-1)^{1+1} \begin{vmatrix} 3 & 2 \\ 2 & 1 \end{vmatrix} = (1)(1)(-1) = -1 \]
Similarly, we take the 4 in the first column and multiply it by its cofactor, as well as with the 3 in the first
column. Finally, we add these numbers together, as given in the following equation.
\[ \det(A) = 1\underbrace{(-1)^{1+1} \begin{vmatrix} 3 & 2 \\ 2 & 1 \end{vmatrix}}_{\operatorname{cof}(A)_{11}} + 4\underbrace{(-1)^{2+1} \begin{vmatrix} 2 & 3 \\ 2 & 1 \end{vmatrix}}_{\operatorname{cof}(A)_{21}} + 3\underbrace{(-1)^{3+1} \begin{vmatrix} 2 & 3 \\ 3 & 2 \end{vmatrix}}_{\operatorname{cof}(A)_{31}} \]
Calculating each of these, we obtain
\[ \det(A) = 1(1)(-1) + 4(-1)(-4) + 3(1)(-5) = -1 + 16 + (-15) = 0 \]
Hence, $\det(A) = 0$.

As mentioned in Definition 3.7, we can choose to expand along any row or column. Let's try now by
expanding along the second row. Here, we take the 4 in the second row and multiply it by its cofactor, then
add this to the 3 in the second row multiplied by its cofactor, and the 2 in the second row multiplied by its
cofactor. The calculation is as follows.
\[ \det(A) = 4\underbrace{(-1)^{2+1} \begin{vmatrix} 2 & 3 \\ 2 & 1 \end{vmatrix}}_{\operatorname{cof}(A)_{21}} + 3\underbrace{(-1)^{2+2} \begin{vmatrix} 1 & 3 \\ 3 & 1 \end{vmatrix}}_{\operatorname{cof}(A)_{22}} + 2\underbrace{(-1)^{2+3} \begin{vmatrix} 1 & 2 \\ 3 & 2 \end{vmatrix}}_{\operatorname{cof}(A)_{23}} \]
Calculating each of these products, we obtain
\[ \det(A) = 4(-1)(-4) + 3(1)(-8) + 2(-1)(-4) = 16 - 24 + 8 = 0 \]
You can see that for both methods, we obtained $\det(A) = 0$. ♠

As  mentioned  above,  we  will  always  come  up  with  the  same  value  for  det  (A)  regardless  of  the  row  or 
column  we  choose  to  expand  along.  You  should  try  to  compute  the  above  determinant  by  expanding  along 
other  rows  and  columns.  This  is  a  good  way  to  check  your  work,  because  you  should  come  up  with  the 
same  number  each  time! 

We  present  this  idea  formally  in  the  following  theorem. 


Theorem  3.9:  The  Determinant  is  Well  Defined 


Expanding the $n \times n$ matrix along any row or column always gives the same answer, which is the
determinant.
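
For a $3 \times 3$ matrix the theorem can be observed directly by carrying out all six expansions. The following is a sketch only; the function names are illustrative, the first matrix is assumed to be the one used in the preceding worked example (reconstructed from the cofactors shown there), and the second matrix is an arbitrary choice made for this illustration.

```python
def det2(M):
    """Determinant of a 2x2 matrix [[a, b], [c, d]], that is ad - cb."""
    (a, b), (c, d) = M
    return a * d - c * b

def cofactor(A, i, j):
    """ij-th cofactor of a 3x3 matrix, rows and columns counted from 1."""
    sub = [[x for q, x in enumerate(row, start=1) if q != j]
           for p, row in enumerate(A, start=1) if p != i]
    return (-1) ** (i + j) * det2(sub)

def expand_along_row(A, i):
    """Sum of each entry of row i times its cofactor."""
    return sum(A[i - 1][j - 1] * cofactor(A, i, j) for j in (1, 2, 3))

def expand_along_column(A, j):
    """Sum of each entry of column j times its cofactor."""
    return sum(A[i - 1][j - 1] * cofactor(A, i, j) for i in (1, 2, 3))

def all_expansions(A):
    return ([expand_along_row(A, i) for i in (1, 2, 3)]
            + [expand_along_column(A, j) for j in (1, 2, 3)])

A = [[1, 2, 3], [4, 3, 2], [3, 2, 1]]    # assumed matrix from the example
C = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]   # arbitrary second test matrix

print(all_expansions(A))
print(all_expansions(C))
```

Each list contains six copies of one number, the determinant of the matrix in question.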


We  have  now  looked  at  the  determinant  of  2  x  2  and  3x3  matrices.  It  turns  out  that  the  method  used 
to  calculate  the  determinant  of  a  3  x  3  matrix  can  be  used  to  calculate  the  determinant  of  any  sized  matrix. 
Notice  that  Definition  3.3,  Definition  3.5  and  Definition  3.7  can  all  be  applied  to  a  matrix  of  any  size. 

For example, the $ij$th minor of a $4 \times 4$ matrix is the determinant of the $3 \times 3$ matrix you obtain when you
delete the $i$th row and the $j$th column. Just as with the $3 \times 3$ determinant, we can compute the determinant
of a $4 \times 4$ matrix by Laplace Expansion, along any row or column.

Consider  the  following  example. 



Example  3.10:  Determinant  of  a  Four  by  Four  Matrix 

Find det(A) where
\[
A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 4 & 2 & 3 \\ 1 & 3 & 4 & 5 \\ 3 & 4 & 3 & 2 \end{bmatrix}
\]

Solution. As in the case of a 3 × 3 matrix, you can expand this along any row or column. Let's pick the
third column. Then, using Laplace Expansion,
\[
\det(A) = 3(-1)^{1+3}\begin{vmatrix} 5 & 4 & 3 \\ 1 & 3 & 5 \\ 3 & 4 & 2 \end{vmatrix}
        + 2(-1)^{2+3}\begin{vmatrix} 1 & 2 & 4 \\ 1 & 3 & 5 \\ 3 & 4 & 2 \end{vmatrix}
        + 4(-1)^{3+3}\begin{vmatrix} 1 & 2 & 4 \\ 5 & 4 & 3 \\ 3 & 4 & 2 \end{vmatrix}
        + 3(-1)^{4+3}\begin{vmatrix} 1 & 2 & 4 \\ 5 & 4 & 3 \\ 1 & 3 & 5 \end{vmatrix}
\]

Now, you can calculate each 3 × 3 determinant using Laplace Expansion, as we did above. You should
complete these as an exercise and verify that det(A) = -12. ♠
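
The claim that every row and every column gives the same value can also be checked mechanically. The following is a minimal Python sketch, not part of the original text: the function names `minor_matrix`, `det_along_row` and `det_along_column` are illustrative choices, and indices are 0-based, which does not change the sign $(-1)^{i+j}$ because shifting both indices by one keeps the parity of their sum. It applies Laplace Expansion to the matrix of Example 3.10 along each of its four rows and four columns.

```python
def minor_matrix(a, i, j):
    """Return the matrix obtained from a by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det_along_row(a, i=0):
    """Determinant by Laplace Expansion along row i (0-based)."""
    n = len(a)
    if n == 1:
        return a[0][0]
    # Each term is an entry of row i times its cofactor; the smaller
    # determinants are expanded along their own first row.
    return sum((-1) ** (i + j) * a[i][j] * det_along_row(minor_matrix(a, i, j))
               for j in range(n))


def det_along_column(a, j=0):
    """Determinant by Laplace Expansion along column j (0-based)."""
    n = len(a)
    if n == 1:
        return a[0][0]
    return sum((-1) ** (i + j) * a[i][j] * det_along_row(minor_matrix(a, i, j))
               for i in range(n))


# The matrix of Example 3.10.
A = [[1, 2, 3, 4],
     [5, 4, 2, 3],
     [1, 3, 4, 5],
     [3, 4, 3, 2]]

row_results = [det_along_row(A, i) for i in range(4)]
column_results = [det_along_column(A, j) for j in range(4)]
print(row_results)     # every entry is -12
print(column_results)  # every entry is -12
```

Because the recursion recomputes many smaller determinants, this approach is only practical for small matrices, which is one reason the row operation method later in this section is useful.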


The following provides a formal definition for the determinant of an n × n matrix. You may wish
to take a moment and consider the above definitions for 2 × 2 and 3 × 3 determinants in the context of this
definition.


In the following sections, we will explore some important properties and characteristics of the
determinant.

3.1.2.  The  Determinant  of  a  Triangular  Matrix 


There  is  a  certain  type  of  matrix  for  which  finding  the  determinant  is  a  very  simple  procedure.  Consider 
the  following  definition. 


The following theorem provides a useful way to calculate the determinant of a triangular matrix.


Theorem 3.13: Determinant of a Triangular Matrix


Let  A  be  an  upper  or  lower  triangular  matrix.  Then  det  (A)  is  obtained  by  taking  the  product  of  the 
entries  on  the  main  diagonal. 


The  verification  of  this  Theorem  can  be  done  by  computing  the  determinant  using  Laplace  Expansion 
along  the  first  row  or  column. 

Consider  the  following  example. 



Example  3.14:  Determinant  of  a  Triangular  Matrix 

Let
\[
A = \begin{bmatrix} 1 & 2 & 3 & 77 \\ 0 & 2 & 6 & 7 \\ 0 & 0 & 3 & 33.7 \\ 0 & 0 & 0 & -1 \end{bmatrix}
\]
Find det(A).

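Solution. From Theorem 3.13, it suffices to take the product of the elements on the main diagonal. Thus
\[
\det(A) = 1 \times 2 \times 3 \times (-1) = -6.
\]
Without using Theorem 3.13, you could use Laplace Expansion. We will expand along the first column.
This gives
\[
\det(A) = 1\begin{vmatrix} 2 & 6 & 7 \\ 0 & 3 & 33.7 \\ 0 & 0 & -1 \end{vmatrix}
        + 0(-1)^{2+1}\begin{vmatrix} 2 & 3 & 77 \\ 0 & 3 & 33.7 \\ 0 & 0 & -1 \end{vmatrix}
        + 0(-1)^{3+1}\begin{vmatrix} 2 & 3 & 77 \\ 2 & 6 & 7 \\ 0 & 0 & -1 \end{vmatrix}
        + 0(-1)^{4+1}\begin{vmatrix} 2 & 3 & 77 \\ 2 & 6 & 7 \\ 0 & 3 & 33.7 \end{vmatrix}
\]
and the only nonzero term in the expansion is
\[
1\begin{vmatrix} 2 & 6 & 7 \\ 0 & 3 & 33.7 \\ 0 & 0 & -1 \end{vmatrix}
\]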


Now find the determinant of this 3 × 3 matrix, by expanding along the first column to obtain
\[
\det(A) = 1 \times \left( 2 \times \begin{vmatrix} 3 & 33.7 \\ 0 & -1 \end{vmatrix}
        + 0(-1)^{2+1}\begin{vmatrix} 6 & 7 \\ 0 & -1 \end{vmatrix}
        + 0(-1)^{3+1}\begin{vmatrix} 6 & 7 \\ 3 & 33.7 \end{vmatrix} \right)
\]
Next use Definition 3.1 to find the determinant of this 2 × 2 matrix, which is just $3 \times (-1) - 0 \times 33.7 = -3$.
Putting all these steps together, we have
\[
\det(A) = 1 \times 2 \times 3 \times (-1) = -6
\]
which is just the product of the entries down the main diagonal of the original matrix! ♠

You  can  see  that  while  both  methods  result  in  the  same  answer,  Theorem  3.13  provides  a  much  quicker 
method. 
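
The comparison between the two methods can be reproduced in a few lines of code. The following is a minimal Python sketch under stated assumptions: the names `diagonal_product` and `laplace_first_column` are hypothetical helpers introduced here, and the entry 33.7 is stored as an exact fraction to avoid rounding. It computes the determinant of the matrix in Example 3.14 both as the product of the diagonal entries and by Laplace Expansion along the first column.

```python
from fractions import Fraction
from math import prod


def diagonal_product(a):
    """Product of the main diagonal entries (Theorem 3.13, triangular matrices only)."""
    return prod(a[i][i] for i in range(len(a)))


def laplace_first_column(a):
    """Determinant by Laplace Expansion along the first column."""
    if len(a) == 1:
        return a[0][0]
    total = 0
    for i, row in enumerate(a):
        if row[0] == 0:
            continue  # a zero entry contributes nothing to the expansion
        # Delete row i and the first column to form the minor.
        sub = [r[1:] for k, r in enumerate(a) if k != i]
        # With 0-based i, the cofactor sign for column 1 is (-1)**i.
        total += (-1) ** i * row[0] * laplace_first_column(sub)
    return total


# The upper triangular matrix of Example 3.14.
A = [[1, 2, 3, 77],
     [0, 2, 6, 7],
     [0, 0, 3, Fraction("33.7")],
     [0, 0, 0, -1]]

print(diagonal_product(A))      # -6
print(laplace_first_column(A))  # -6
```

Skipping the zero entries in the first column mirrors the hand computation, where only one term of each expansion survives.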

In  the  next  section,  we  explore  some  important  properties  of  determinants. 

3.1.3.  Properties  of  Determinants  I:  Examples 


There  are  many  important  properties  of  determinants.  Since  many  of  these  properties  involve  the  row 
operations  discussed  in  Chapter  1,  we  recall  that  definition  now. 


We  will  now  consider  the  effect  of  row  operations  on  the  determinant  of  a  matrix.  In  future  sections, 
we  will  see  that  using  the  following  properties  can  greatly  assist  in  finding  determinants.  This  section  will 
use  the  theorems  as  motivation  to  provide  various  examples  of  the  usefulness  of  the  properties. 

The first theorem explains the effect on the determinant of a matrix when two rows are switched.


When we switch two rows of a matrix, the determinant is multiplied by -1. Consider the following
example.


Example  3.17:  Switching  Two  Rows 

Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and let $B = \begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix}$. Knowing that det(A) = -2, find det(B).

Solution. By Definition 3.1, $\det(A) = 1 \times 4 - 3 \times 2 = -2$. Notice that the rows of B are the rows of A but
switched. By Theorem 3.16 since two rows of A have been switched, $\det(B) = -\det(A) = -(-2) = 2$.
You can verify this using Definition 3.1. ♠

The next theorem demonstrates the effect on the determinant of a matrix when we multiply a row by a
scalar.


Theorem 3.18: Multiplying a Row by a Scalar


Let A be an n × n matrix and let B be a matrix which results from multiplying some row of A by a
scalar k. Then det(B) = k det(A).


Notice that this theorem is true when we multiply one row of the matrix by k. If we were to multiply
two rows of A by k to obtain B, we would have $\det(B) = k^2 \det(A)$. Suppose we were to multiply all n
rows of A by k to obtain the matrix B, so that B = kA. Then, $\det(B) = k^n \det(A)$. This gives the next
theorem.


Consider  the  following  example. 


Example  3.20:  Multiplying  a  Row  by  5 

Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, $B = \begin{bmatrix} 5 & 10 \\ 3 & 4 \end{bmatrix}$. Knowing that det(A) = -2, find det(B).

Solution. By Definition 3.1, det(A) = -2. We can also compute det(B) using Definition 3.1, and we see
that det(B) = -10.

Now, let's compute det(B) using Theorem 3.18 and see if we obtain the same answer. Notice that the
first row of B is 5 times the first row of A, while the second row of B is equal to the second row of A. By
Theorem 3.18, $\det(B) = 5 \times \det(A) = 5 \times (-2) = -10$.

You can see that this matches our answer above. ♠

Finally,  consider  the  next  theorem  for  the  last  row  operation,  that  of  adding  a  multiple  of  a  row  to 
another  row. 


Theorem  3.21:  Adding  a  Multiple  of  a  Row  to  Another  Row 


Let  A  be  an  n  x  n  matrix  and  let  B  be  a  matrix  which  results  from  adding  a  multiple  of  a  row  to 
another  row.  Then  det  (A)  =  det  (B). 


Therefore,  when  we  add  a  multiple  of  a  row  to  another  row,  the  determinant  of  the  matrix  is  unchanged. 
Note  that  if  a  matrix  A  contains  a  row  which  is  a  multiple  of  another  row,  det  (A)  will  equal  0.  To  see  this, 
suppose  the  first  row  of  A  is  equal  to  —1  times  the  second  row.  By  Theorem  3.21,  we  can  add  the  first  row 
to  the  second  row,  and  the  determinant  will  be  unchanged.  However,  this  row  operation  will  result  in  a 
row  of  zeros.  Using  Laplace  Expansion  along  the  row  of  zeros,  we  find  that  the  determinant  is  0. 

Consider the following example.


Example 3.22: Adding a Row to Another Row

Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and let $B = \begin{bmatrix} 1 & 2 \\ 5 & 8 \end{bmatrix}$. Find det(B).

Solution. By Definition 3.1, det(A) = -2. Notice that the second row of B is two times the first row of A
added to the second row. By Theorem 3.21, det(B) = det(A) = -2. As usual, you can verify this answer
using Definition 3.1. ♠


Example  3.23:  Multiple  of  a  Row 

Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$. Show that det(A) = 0.

Solution. Using Definition 3.1, the determinant is given by
\[
\det(A) = 1 \times 4 - 2 \times 2 = 0
\]
However notice that the second row is equal to 2 times the first row. Then by the discussion above
following Theorem 3.21 the determinant will equal 0. ♠
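
The effect of each row operation on the determinant can be confirmed numerically on the 2 × 2 matrices used in the examples above. The following is a small Python sketch, not part of the original text; the helper name `det2` and the variable names are illustrative assumptions.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix [[a, b], [c, d]], computed as ad - bc."""
    (a, b), (c, d) = m
    return a * d - b * c


A = [[1, 2], [3, 4]]

swapped = [A[1], A[0]]                                   # switch the two rows
scaled = [[5 * x for x in A[0]], A[1]]                   # multiply the first row by 5
added = [A[0], [y + 2 * x for x, y in zip(A[0], A[1])]]  # add 2 times row 1 to row 2
all_scaled = [[5 * x for x in row] for row in A]         # multiply every row by 5
proportional = [[1, 2], [2, 4]]                          # second row is 2 times the first

print(det2(A))             # -2
print(det2(swapped))       # 2, the sign changes
print(det2(scaled))        # -10, which is 5 * det(A)
print(det2(added))         # -2, unchanged
print(det2(all_scaled))    # -50, which is 5**2 * det(A)
print(det2(proportional))  # 0
```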

Until  now,  our  focus  has  primarily  been  on  row  operations.  However,  we  can  carry  out  the  same 
operations  with  columns,  rather  than  rows.  The  three  operations  outlined  in  Definition  3.15  can  be  done 
with columns instead of rows. In this case, in Theorems 3.16, 3.18, and 3.21 you can replace the word
"row" with the word "column".

There are several other major properties of determinants which do not involve row (or column)
operations. The first is the determinant of a product of matrices.


In order to find the determinant of a product of matrices, we can simply take the product of the
determinants.


Consider  the  following  example. 



Example  3.25:  The  Determinant  of  a  Product 

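Compare det(AB) and det(A) det(B) for
\[
A = \begin{bmatrix} 1 & 2 \\ -3 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} 3 & 2 \\ 4 & 1 \end{bmatrix}
\]

Solution. First compute AB, which is given by
\[
AB = \begin{bmatrix} 1 & 2 \\ -3 & 2 \end{bmatrix} \begin{bmatrix} 3 & 2 \\ 4 & 1 \end{bmatrix}
   = \begin{bmatrix} 11 & 4 \\ -1 & -4 \end{bmatrix}
\]
and so by Definition 3.1
\[
\det(AB) = \det \begin{bmatrix} 11 & 4 \\ -1 & -4 \end{bmatrix} = -40
\]
Now
\[
\det(A) = \det \begin{bmatrix} 1 & 2 \\ -3 & 2 \end{bmatrix} = 8
\]
and
\[
\det(B) = \det \begin{bmatrix} 3 & 2 \\ 4 & 1 \end{bmatrix} = -5
\]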


Computing det(A) × det(B) we have $8 \times (-5) = -40$. This is the same answer as above and you can
see that $\det(A)\det(B) = 8 \times (-5) = -40 = \det(AB)$. ♠
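
The product rule can be checked directly for the matrices of Example 3.25. The following is a brief Python sketch under stated assumptions; `det2` and `matmul` are hypothetical helper names introduced for illustration only.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix [[a, b], [c, d]], computed as ad - bc."""
    (a, b), (c, d) = m
    return a * d - b * c


def matmul(x, y):
    """Matrix product of x and y, given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


A = [[1, 2], [-3, 2]]
B = [[3, 2], [4, 1]]
AB = matmul(A, B)

print(AB)                 # [[11, 4], [-1, -4]]
print(det2(AB))           # -40
print(det2(A) * det2(B))  # 8 * (-5) = -40
```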

Consider  the  next  important  property. 


This  theorem  is  illustrated  in  the  following  example. 


Solution.  First,  note  that 


Using Definition 3.1, we can compute det(A) and $\det(A^T)$. It follows that $\det(A) = 2 \times 3 - 4 \times 5 = -14$
and $\det(A^T) = 2 \times 3 - 5 \times 4 = -14$. Hence, $\det(A) = \det(A^T)$. ♠

The following provides an essential property of the determinant, as well as a useful way to determine
if a matrix is invertible.


Theorem 3.28: Determinant of the Inverse


Let A be an n × n matrix. Then A is invertible if and only if $\det(A) \neq 0$. If this is true, it follows that
\[
\det(A^{-1}) = \frac{1}{\det(A)}
\]


Consider  the  following  example. 


Example  3.29:  Determinant  of  an  Invertible  Matrix 

Let $A = \begin{bmatrix} 3 & 6 \\ 2 & 4 \end{bmatrix}$, $B = \begin{bmatrix} 2 & 3 \\ 5 & 1 \end{bmatrix}$. For each matrix, determine if it is invertible. If so, find the
determinant of the inverse.

Solution. Consider the matrix A first. Using Definition 3.1 we can find the determinant as follows:
\[
\det(A) = 3 \times 4 - 2 \times 6 = 12 - 12 = 0
\]
By Theorem 3.28 A is not invertible.

Now consider the matrix B. Again by Definition 3.1 we have
\[
\det(B) = 2 \times 1 - 5 \times 3 = 2 - 15 = -13
\]
By Theorem 3.28 B is invertible and the determinant of the inverse is given by
\[
\det(B^{-1}) = \frac{1}{\det(B)} = \frac{1}{-13} = -\frac{1}{13}
\]
♠
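
Theorem 3.28 can be illustrated on the two matrices of Example 3.29. The following is a minimal Python sketch, not part of the original text: `det2` and `inverse2` are hypothetical helpers, and `inverse2` relies on the standard formula for the inverse of a 2 × 2 matrix, which is an assumption brought in here rather than something established in this section. Exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2 x 2 matrix [[a, b], [c, d]], computed as ad - bc."""
    (a, b), (c, d) = m
    return a * d - b * c


def inverse2(m):
    """Inverse of a 2 x 2 matrix, or None when the determinant is zero."""
    d = det2(m)
    if d == 0:
        return None  # not invertible, by Theorem 3.28
    (a, b), (c, e) = m
    return [[Fraction(e, d), Fraction(-b, d)],
            [Fraction(-c, d), Fraction(a, d)]]


A = [[3, 6], [2, 4]]
B = [[2, 3], [5, 1]]

print(inverse2(A))        # None, since det(A) = 0
B_inv = inverse2(B)
print(det2(B_inv))        # -1/13
print(Fraction(1, det2(B)))  # -1/13, the reciprocal of det(B)
```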

3.1.4.  Properties  of  Determinants  II:  Some  Important  Proofs 


This  section  includes  some  important  proofs  on  determinants  and  cofactors. 

First we recall the definition of a determinant. If $A = [a_{ij}]$ is an n × n matrix, then det A is defined by
computing the expansion along the first row:
\[
\det A = \sum_{i=1}^{n} a_{1,i}\,\mathrm{cof}(A)_{1,i}. \tag{3.1}
\]
If n = 1 then $\det A = a_{1,1}$.

The following example is straightforward and strongly recommended as a means for getting used to
definitions.


Example  3.30: 


(1) Let $E_{ij}$ be the elementary matrix obtained by interchanging the ith and jth rows of I. Then $\det E_{ij} = -1$.

(2) Let $E_{ik}$ be the elementary matrix obtained by multiplying the ith row of I by k. Then $\det E_{ik} = k$.

(3) Let $E_{ijk}$ be the elementary matrix obtained by multiplying the ith row of I by k and adding it to its
jth row. Then $\det E_{ijk} = 1$.

(4) If C and B are such that CB is defined and the ith row of C consists of zeros, then the ith row of
CB consists of zeros.

(5) If E is an elementary matrix, then $\det E = \det E^T$.
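
Parts (1), (2), (3) and (5) of this example can be spot-checked on 3 × 3 elementary matrices. The following is a Python sketch under stated assumptions: the helper names (`det`, `identity`, `swap_rows`, `scale_row`, `add_multiple`, `transpose`) are hypothetical, row indices are 0-based, and a check on particular matrices is of course not a proof.

```python
def det(a):
    """Determinant by Laplace Expansion along the first row, as in (3.1)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))


def identity(n):
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def swap_rows(n, i, j):
    """Elementary matrix obtained by interchanging rows i and j of I."""
    e = identity(n)
    e[i], e[j] = e[j], e[i]
    return e


def scale_row(n, i, k):
    """Elementary matrix obtained by multiplying row i of I by k."""
    e = identity(n)
    e[i][i] = k
    return e


def add_multiple(n, i, j, k):
    """Elementary matrix obtained by adding k times row i of I to row j."""
    e = identity(n)
    e[j][i] = k
    return e


def transpose(a):
    return [list(col) for col in zip(*a)]


elementary = [swap_rows(3, 0, 2), scale_row(3, 1, 7), add_multiple(3, 0, 2, 4)]
print([det(e) for e in elementary])             # [-1, 7, 1]
print([det(transpose(e)) for e in elementary])  # [-1, 7, 1]
```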


Many of the proofs in this section use the Principle of Mathematical Induction. This concept is discussed
in Appendix A.2 and is reviewed here for convenience. First we check that the assertion is true for n = 2
(the case n = 1 is either completely trivial or meaningless).

Next, we assume that the assertion is true for $n - 1$ (where $n \geq 3$) and prove it for n. Once this is
accomplished, by the Principle of Mathematical Induction we can conclude that the statement is true for
all n × n matrices for every $n \geq 2$.

If A is an n × n matrix and $1 \leq j \leq n$, then the matrix obtained by removing the 1st column and jth row
from A is an $(n-1) \times (n-1)$ matrix (we shall denote this matrix by A(j) below). Since these matrices are used
in the computation of the cofactors $\mathrm{cof}(A)_{1,i}$, for $1 \leq i \leq n$, the inductive assumption applies to these matrices.

Consider  the  following  lemma. 


Proof. We will prove this lemma using Mathematical Induction.

If n = 2 this is easy (check!).

Let $n \geq 3$ be such that every matrix of size $(n-1) \times (n-1)$ with a row consisting of zeros has determinant
equal to zero. Let i be such that the ith row of A consists of zeros. Then we have $a_{i,j} = 0$ for $1 \leq j \leq n$.

Fix $j \in \{1, 2, \ldots, n\}$ such that $j \neq i$. Then the matrix A(j) used in the computation of $\mathrm{cof}(A)_{1,j}$ has a row
consisting of zeros, and by our inductive assumption $\mathrm{cof}(A)_{1,j} = 0$.

On the other hand, if j = i then $a_{1,j} = 0$. Therefore $a_{1,j}\,\mathrm{cof}(A)_{1,j} = 0$ for all j and by (3.1) we have
\[
\det A = \sum_{j=1}^{n} a_{1,j}\,\mathrm{cof}(A)_{1,j} = 0
\]
as each of the summands is equal to 0.


♠

Lemma 3.32:


Assume A, B and C are n × n matrices that for some $1 \leq i \leq n$ satisfy the following.

1. The jth rows of all three matrices are identical, for $j \neq i$.

2. Each entry in the ith row of A is the sum of the corresponding entries in the ith rows of B and C.

Then det A = det B + det C.


Proof. This is not difficult to check for n = 2 (do check it!).

Now assume that the statement of the Lemma is true for $(n-1) \times (n-1)$ matrices and fix A, B and C as
in the statement. The assumptions state that we have $a_{l,j} = b_{l,j} = c_{l,j}$ for $j \neq i$ and for $1 \leq l \leq n$, and
$a_{l,i} = b_{l,i} + c_{l,i}$ for all $1 \leq l \leq n$. Therefore A(i) = B(i) = C(i), and A(j) has the property that its ith row
is the sum of the ith rows of B(j) and C(j) for $j \neq i$ while the other rows of all three matrices are identical.
Therefore by our inductive assumption we have $\mathrm{cof}(A)_{1,j} = \mathrm{cof}(B)_{1,j} + \mathrm{cof}(C)_{1,j}$ for $j \neq i$.

By (3.1) we have (using all equalities established above)
\[
\begin{aligned}
\det A &= \sum_{l=1}^{n} a_{1,l}\,\mathrm{cof}(A)_{1,l} \\
       &= \sum_{l \neq i} a_{1,l}\left(\mathrm{cof}(B)_{1,l} + \mathrm{cof}(C)_{1,l}\right) + (b_{1,i} + c_{1,i})\,\mathrm{cof}(A)_{1,i} \\
       &= \det B + \det C
\end{aligned}
\]
This proves that the assertion is true for all n and completes the proof. ♠


Proof. We prove all statements by induction. The case n = 2 is easily checked directly (and it is strongly
suggested that you do check it).

We assume $n \geq 3$ and (1)-(4) are true for all matrices of size $(n-1) \times (n-1)$.

(1) We prove the case when j = i + 1, i.e., we are interchanging two consecutive rows.

Let $l \in \{1, \ldots, n\} \setminus \{i, j\}$. Then A(l) is obtained from B(l) by interchanging two of its rows (draw a
picture) and by our assumption
\[
\mathrm{cof}(A)_{1,l} = -\mathrm{cof}(B)_{1,l}. \tag{3.2}
\]
Now consider $a_{1,i}\,\mathrm{cof}(A)_{1,i}$. We have that $a_{1,i} = b_{1,j}$ and also that A(i) = B(j). Since j = i + 1, we have
\[
(-1)^{1+j} = (-1)^{1+i+1} = -(-1)^{1+i}
\]
and therefore $a_{1,i}\,\mathrm{cof}(A)_{1,i} = -b_{1,j}\,\mathrm{cof}(B)_{1,j}$ and $a_{1,j}\,\mathrm{cof}(A)_{1,j} = -b_{1,i}\,\mathrm{cof}(B)_{1,i}$. Putting this together with (3.2)
into (3.1) we see that if in the formula for det A we change the sign of each of the summands we obtain the
formula for det B.
\[
\det A = \sum_{l=1}^{n} a_{1,l}\,\mathrm{cof}(A)_{1,l} = -\sum_{l=1}^{n} b_{1,l}\,\mathrm{cof}(B)_{1,l} = -\det B.
\]

We have therefore proved the case of (1) when j = i + 1. In order to prove the general case, one needs
the following fact. If i < j, then in order to interchange the ith and jth rows one can proceed by interchanging
two adjacent rows $2(j-i)-1$ times: First swap the ith and (i+1)st rows, then the (i+1)st and (i+2)nd, and so on. After
one interchanges the (j-1)st and jth rows, we have the ith row in the position of the jth and the lth row in the position of the (l-1)st
for $i+1 \leq l \leq j$; this takes $j-i$ swaps. Then proceed backwards swapping adjacent rows until everything is in place,
which takes a further $j-i-1$ swaps.

Since $2(j-i)-1$ is an odd number, $(-1)^{2(j-i)-1} = -1$ and we have that det A = -det B.

(2) This is like (1)... but much easier. Assume that (2) is true for all $(n-1) \times (n-1)$ matrices. We
have that $a_{j,i} = k\,b_{j,i}$ for $1 \leq j \leq n$. In particular $a_{1,i} = k\,b_{1,i}$, and for $l \neq i$ the matrix A(l) is obtained from
B(l) by multiplying one of its rows by k. Therefore $\mathrm{cof}(A)_{1,l} = k\,\mathrm{cof}(B)_{1,l}$ for $l \neq i$, and for all l we have
$a_{1,l}\,\mathrm{cof}(A)_{1,l} = k\,b_{1,l}\,\mathrm{cof}(B)_{1,l}$. By (3.1), we have det A = k det B.

(3) This is a consequence of (1). If two rows of A are identical, then A is equal to the matrix obtained
by interchanging those two rows and therefore by (1) det A = -det A. This implies det A = 0.

(4) Assume (4) is true for all $(n-1) \times (n-1)$ matrices and fix A and B such that A is obtained by
multiplying the ith row of B by k and adding it to the jth row of B ($i \neq j$). We must show that det A = det B. If k = 0 then A = B and
there is nothing to prove, so we may assume $k \neq 0$.

Let C be the matrix obtained by replacing the jth row of B by the ith row of B multiplied by k. By
Lemma 3.32, we have that
\[
\det A = \det B + \det C
\]
and we 'only' need to show that det C = 0. But the ith and jth rows of C are proportional. If D is obtained by
multiplying the jth row of C by $\frac{1}{k}$ then by (2) we have $\det D = \frac{1}{k}\det C$, that is, $\det C = k \det D$ (recall that $k \neq 0$!). But the ith and jth
rows of D are identical, hence by (3) we have det D = 0 and therefore det C = 0.


Proof. If A is an elementary matrix of any type, then multiplying by A on the left has the same effect as
performing the corresponding elementary row operation. Therefore the equality det(AB) = det A det B in
this case follows by Example 3.30 and Theorem 3.33.

If C is the reduced row-echelon form of A then we can write $A = E_1 \cdot E_2 \cdots E_m \cdot C$ for some elementary
matrices $E_1, \ldots, E_m$.

Now we consider two cases.

Assume first that C = I. Then $A = E_1 \cdot E_2 \cdots E_m$ and $AB = E_1 \cdot E_2 \cdots E_m B$. By applying the above
equality m times, and then m - 1 times, we have that
\[
\begin{aligned}
\det AB &= \det E_1 \det E_2 \cdots \det E_m \cdot \det B \\
        &= \det(E_1 \cdot E_2 \cdots E_m) \det B \\
        &= \det A \det B.
\end{aligned}
\]
Now assume $C \neq I$. Since it is in reduced row-echelon form, its last row consists of zeros and by (4)
of Example 3.30 the last row of CB consists of zeros. By Lemma 3.31 we have det C = det(CB) = 0 and
therefore
\[
\det A = \det(E_1 \cdot E_2 \cdots E_m) \cdot \det(C) = \det(E_1 \cdot E_2 \cdots E_m) \cdot 0 = 0
\]
and also
\[
\det AB = \det(E_1 \cdot E_2 \cdots E_m) \cdot \det(CB) = \det(E_1 \cdot E_2 \cdots E_m) \cdot 0 = 0
\]
hence det AB = 0 = det A det B. ♠

The  same  ‘machine’  used  in  the  previous  proof  will  be  used  again. 


Proof. Note first that the conclusion is true if A is elementary by (5) of Example 3.30.

Let C be the reduced row-echelon form of A. Then we can write $A = E_1 \cdot E_2 \cdots E_m C$. Then
$A^T = C^T \cdot E_m^T \cdots E_2^T \cdot E_1^T$. By Theorem 3.34 we have
\[
\det(A^T) = \det(C^T) \cdot \det(E_m^T) \cdots \det(E_2^T) \cdot \det(E_1^T).
\]
By (5) of Example 3.30 we have that $\det E_j = \det E_j^T$ for all j. Also, det C is either 0 or 1 (depending on
whether C = I or not) and in either case $\det C = \det C^T$. Therefore $\det A = \det A^T$. ♠

The  above  discussions  allow  us  to  now  prove  Theorem  3.9.  It  is  restated  below. 


Theorem  3.36: 


Expanding  an  n  x  n  matrix  along  any  row  or  column  always  gives  the  same  result,  which  is  the 
determinant. 


Proof. We first show that the determinant can be computed along any row. The case n = 1 does not apply
and thus let $n \geq 2$.

Let A be an n × n matrix and fix j > 1. We need to prove that
\[
\det A = \sum_{i=1}^{n} a_{j,i}\,\mathrm{cof}(A)_{j,i}.
\]
Let us prove the case when j = 2.

Let B be the matrix obtained from A by interchanging its 1st and 2nd rows. Then by Theorem 3.33 we
have
\[
\det A = -\det B.
\]
Now we have
\[
\det B = \sum_{i=1}^{n} b_{1,i}\,\mathrm{cof}(B)_{1,i}.
\]
Since B is obtained by interchanging the 1st and 2nd rows of A we have that $b_{1,i} = a_{2,i}$ for all i and one
can see that $\mathrm{minor}(B)_{1,i} = \mathrm{minor}(A)_{2,i}$.

Further,
\[
\mathrm{cof}(B)_{1,i} = (-1)^{1+i}\,\mathrm{minor}(B)_{1,i} = -(-1)^{2+i}\,\mathrm{minor}(A)_{2,i} = -\mathrm{cof}(A)_{2,i}
\]
hence $\det B = -\sum_{i=1}^{n} a_{2,i}\,\mathrm{cof}(A)_{2,i}$, and therefore $\det A = -\det B = \sum_{i=1}^{n} a_{2,i}\,\mathrm{cof}(A)_{2,i}$ as desired.

The case when j > 2 is very similar; we still have $\mathrm{minor}(B)_{1,i} = \mathrm{minor}(A)_{j,i}$ but checking that
$\det B = -\sum_{i=1}^{n} a_{j,i}\,\mathrm{cof}(A)_{j,i}$ is slightly more involved.

Now the cofactor expansion along column j of A is equal to the cofactor expansion along row j of $A^T$,
which is by the above result just proved equal to the cofactor expansion along row 1 of $A^T$, which is equal
to the cofactor expansion along column 1 of A. Thus the cofactor expansion along any column yields the
same result.

Finally, since $\det A = \det A^T$ by Theorem 3.35, we conclude that the cofactor expansion along row 1
of A is equal to the cofactor expansion along row 1 of $A^T$, which is equal to the cofactor expansion along
column 1 of A. Thus the proof is complete. ♠

3.1.5.  Finding  Determinants  using  Row  Operations 


Theorems  3.16,  3.18  and  3.21  illustrate  how  row  operations  affect  the  determinant  of  a  matrix.  In  this 
section,  we  look  at  two  examples  where  row  operations  are  used  to  find  the  determinant  of  a  large  matrix. 
Recall that when working with large matrices, Laplace Expansion is effective but time-consuming, as there are
many steps involved. This section provides useful tools for an alternative method. By first applying row
operations, we can obtain a simpler matrix to which we apply Laplace Expansion.

While  working  through  questions  such  as  these,  it  is  useful  to  record  your  row  operations  as  you  go 
along.  Keep  this  in  mind  as  you  read  through  the  next  example. 



Example  3.37:  Finding  a  Determinant 

Find the determinant of the matrix
\[
A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 1 & 2 & 3 \\ 4 & 5 & 4 & 3 \\ 2 & 2 & -4 & 5 \end{bmatrix}
\]

Solution. We will use the properties of determinants outlined above to find det(A). First, add -5 times
the first row to the second row. Then add -4 times the first row to the third row, and -2 times the first
row to the fourth row. This yields the matrix
\[
B = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & -9 & -13 & -17 \\ 0 & -3 & -8 & -13 \\ 0 & -2 & -10 & -3 \end{bmatrix}
\]
Notice that the only row operation we have done so far is adding a multiple of a row to another row.
Therefore, by Theorem 3.21, det(B) = det(A).

At  this  stage,  you  could  use  Laplace  Expansion  to  find  det  ( B ) .  However,  we  will  continue  with  row 
operations  to  find  an  even  simpler  matrix  to  work  with. 

Add -3 times the third row to the second row. By Theorem 3.21 this does not change the value of the
determinant. Then, multiply the fourth row by -3. This results in the matrix
\[
C = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 0 & 11 & 22 \\ 0 & -3 & -8 & -13 \\ 0 & 6 & 30 & 9 \end{bmatrix}
\]
Here, det(C) = -3 det(B), which means that $\det(B) = \left(-\frac{1}{3}\right)\det(C)$.

Since det(A) = det(B), we now have that $\det(A) = \left(-\frac{1}{3}\right)\det(C)$. Again, you could use Laplace
Expansion here to find det(C). However, we will continue with row operations.

Now add 2 times the third row to the fourth row. This does not change the value of the
determinant by Theorem 3.21. Finally switch the third and second rows. This causes the determinant to
be multiplied by -1. Thus det(C) = -det(D) where
\[
D = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & -3 & -8 & -13 \\ 0 & 0 & 11 & 22 \\ 0 & 0 & 14 & -17 \end{bmatrix}
\]
Hence, $\det(A) = \left(-\frac{1}{3}\right)\det(C) = \left(\frac{1}{3}\right)\det(D)$.

You could do more row operations or you could note that this can be easily expanded along the first
column. Then, expand the resulting 3 × 3 matrix also along the first column. This results in
\[
\det(D) = 1(-3)\begin{vmatrix} 11 & 22 \\ 14 & -17 \end{vmatrix} = 1485
\]
and so $\det(A) = \left(\frac{1}{3}\right)(1485) = 495$. ♠

You can see that by using row operations, we can simplify a matrix to the point where Laplace
Expansion involves only a few steps. In Example 3.37, we also could have continued until the matrix was in
upper triangular form, and taken the product of the entries on the main diagonal. Whenever computing the
determinant, it is useful to consider all the possible methods and tools.
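
The alternative mentioned above, reducing all the way to upper triangular form, lends itself to a short program. The following is a minimal Python sketch under stated assumptions: the function name `det_by_row_reduction` is hypothetical, and the sequence of row operations it performs differs from the one chosen by hand in Example 3.37. It uses only row switches (each one changes the sign, Theorem 3.16) and additions of a multiple of one row to another (which leave the determinant unchanged, Theorem 3.21), then multiplies the diagonal entries (Theorem 3.13). Exact fractions avoid rounding error.

```python
from fractions import Fraction


def det_by_row_reduction(matrix):
    """Determinant via reduction to upper triangular form."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    sign = 1
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot available, so the determinant is 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign        # switching two rows multiplies det by -1
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            # Adding a multiple of the pivot row leaves det unchanged.
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= a[i][i]       # product of the diagonal entries
    return result


# The matrix of Example 3.37.
A = [[1, 2, 3, 4],
     [5, 1, 2, 3],
     [4, 5, 4, 3],
     [2, 2, -4, 5]]

print(det_by_row_reduction(A))  # 495
```

Unlike the hand computation, this sketch never multiplies a row by a scalar, so there is no factor such as 1/3 to keep track of; only the sign changes from row switches need to be recorded.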

Consider the next example.


Example  3.38:  Find  the  Determinant 

Find the determinant of the matrix
\[
A = \begin{bmatrix} 1 & 2 & 3 & 2 \\ 1 & -3 & 2 & 1 \\ 2 & 1 & 2 & 5 \\ 3 & -4 & 1 & 2 \end{bmatrix}
\]

Solution. Once again, we will simplify the matrix through row operations. Add -1 times the first row to
the second row. Next add -2 times the first row to the third and finally take -3 times the first row and add
to the fourth row. This yields
\[
B = \begin{bmatrix} 1 & 2 & 3 & 2 \\ 0 & -5 & -1 & -1 \\ 0 & -3 & -4 & 1 \\ 0 & -10 & -8 & -4 \end{bmatrix}
\]
By Theorem 3.21, det(A) = det(B).

Remember you can work with the columns also. Take -5 times the fourth column and add to the
second column. This yields
\[
C = \begin{bmatrix} 1 & -8 & 3 & 2 \\ 0 & 0 & -1 & -1 \\ 0 & -8 & -4 & 1 \\ 0 & 10 & -8 & -4 \end{bmatrix}
\]
By Theorem 3.21 det(A) = det(C).

Now take -1 times the third row and add to the top row. This yields
\[
D = \begin{bmatrix} 1 & 0 & 7 & 1 \\ 0 & 0 & -1 & -1 \\ 0 & -8 & -4 & 1 \\ 0 & 10 & -8 & -4 \end{bmatrix}
\]
which by Theorem 3.21 has the same determinant as A.

Now, we can find det(D) by expanding along the first column as follows. You can see that there will
be only one nonzero term.
\[
\det(D) = 1 \det \begin{bmatrix} 0 & -1 & -1 \\ -8 & -4 & 1 \\ 10 & -8 & -4 \end{bmatrix} + 0 + 0 + 0
\]
Expanding again along the first column, we have
\[
\det(D) = 1 \left( 0 + 8 \det \begin{bmatrix} -1 & -1 \\ -8 & -4 \end{bmatrix} + 10 \det \begin{bmatrix} -1 & -1 \\ -4 & 1 \end{bmatrix} \right) = -82
\]
Now since det(A) = det(D), it follows that det(A) = -82. ♠

Remember that you can verify these answers by using Laplace Expansion on A. Similarly, if you first
compute the determinant using Laplace Expansion, you can use the row operation method to verify.

Exercises


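Exercise 3.1.1 Find the determinants of the following matrices.

(a) $\begin{bmatrix} 1 & 3 \\ 0 & 2 \end{bmatrix}$

(b) $\begin{bmatrix} 0 & 3 \\ 0 & 2 \end{bmatrix}$

(c) $\begin{bmatrix} 4 & 3 \\ 6 & 2 \end{bmatrix}$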


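Exercise 3.1.2 Let $A = \begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & 3 \\ -2 & 5 & 1 \end{bmatrix}$. Find the following.

(a) $\mathrm{minor}(A)_{11}$

(b) $\mathrm{minor}(A)_{21}$

(c) $\mathrm{minor}(A)_{22}$

(d) $\mathrm{cof}(A)_{11}$

(e) $\mathrm{cof}(A)_{21}$

(f) $\mathrm{cof}(A)_{32}$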


Exercise 3.1.3 Find the determinants of the following matrices.

(a) $\begin{bmatrix} 1 & 2 & 3 \\ 3 & 2 & 2 \\ 0 & 9 & 8 \end{bmatrix}$

(b) $\begin{bmatrix} 4 & 3 & 2 \\ 1 & 7 & 8 \\ 3 & -9 & 3 \end{bmatrix}$

(c) $\begin{bmatrix} 1 & 2 & 3 & 2 \\ 1 & 3 & 2 & 3 \\ 4 & 1 & 5 & 0 \\ 1 & 2 & 1 & 2 \end{bmatrix}$


Exercise 3.1.4 Find the following determinant by expanding along the first row and second column.
\[
\begin{vmatrix} 1 & 2 & 1 \\ 2 & 1 & 3 \\ 2 & 1 & 1 \end{vmatrix}
\]


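Exercise 3.1.5 Find the following determinant by expanding along the first column and third row.
\[
\begin{vmatrix} 1 & 2 & 1 \\ 1 & 0 & 1 \\ 2 & 1 & 1 \end{vmatrix}
\]


Exercise 3.1.6 Find the following determinant by expanding along the second row and first column.
\[
\begin{vmatrix} 1 & 2 & 1 \\ 2 & 1 & 3 \\ 2 & 1 & 1 \end{vmatrix}
\]


Exercise 3.1.7 Compute the determinant by cofactor expansion. Pick the easiest row or column to use.
\[
\begin{vmatrix} 1 & 0 & 0 & 1 \\ 2 & 1 & 1 & 0 \\ 0 & 0 & 0 & 2 \\ 2 & 1 & 3 & 1 \end{vmatrix}
\]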


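Exercise 3.1.8 Find the determinant of the following matrices.

(a) $A = \begin{bmatrix} 1 & -34 \\ 0 & 2 \end{bmatrix}$

(b) $A = \begin{bmatrix} 4 & 3 & 14 \\ 0 & -2 & 0 \\ 0 & 0 & 5 \end{bmatrix}$

(c) $A = \begin{bmatrix} 2 & 3 & 15 & 0 \\ 0 & 4 & 1 & 7 \\ 0 & 0 & -3 & 5 \\ 0 & 0 & 0 & 1 \end{bmatrix}$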


Exercise 3.1.9 An operation is done to get from the first matrix to the second. Identify what was done and
tell how it will affect the value of the determinant.
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \to \cdots \to \begin{bmatrix} a & c \\ b & d \end{bmatrix}
\]

Exercise 3.1.10 An operation is done to get from the first matrix to the second. Identify what was done
and tell how it will affect the value of the determinant.
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \to \cdots \to \begin{bmatrix} c & d \\ a & b \end{bmatrix}
\]

Exercise 3.1.11 An operation is done to get from the first matrix to the second. Identify what was done
and tell how it will affect the value of the determinant.
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \to \cdots \to \begin{bmatrix} a & b \\ a + c & b + d \end{bmatrix}
\]

Exercise 3.1.12 An operation is done to get from the first matrix to the second. Identify what was done
and tell how it will affect the value of the determinant.
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \to \cdots \to \begin{bmatrix} a & b \\ 2c & 2d \end{bmatrix}
\]

Exercise 3.1.13 An operation is done to get from the first matrix to the second. Identify what was done
and tell how it will affect the value of the determinant.
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \to \cdots \to \begin{bmatrix} b & a \\ d & c \end{bmatrix}
\]


Exercise 3.1.14 Let A be an r × r matrix and suppose there are r - 1 rows (columns) such that all rows
(columns) are linear combinations of these r - 1 rows (columns). Show det(A) = 0.

Exercise 3.1.15 Show $\det(aA) = a^n \det(A)$ for an n × n matrix A and scalar a.

Exercise 3.1.16 Construct 2 × 2 matrices A and B to show that det A det B = det(AB).

Exercise 3.1.17 Is it true that det(A + B) = det(A) + det(B)? If this is so, explain why. If it is not so, give
a counter example.

Exercise 3.1.18 An n × n matrix is called nilpotent if for some positive integer k it follows that $A^k = 0$. If A is
a nilpotent matrix and k is the smallest possible integer such that $A^k = 0$, what are the possible values of
det(A)?

Exercise 3.1.19 A matrix is said to be orthogonal if $A^T A = I$. Thus the inverse of an orthogonal matrix is
just its transpose. What are the possible values of det(A) if A is an orthogonal matrix?

Exercise 3.1.20 Let A and B be two n × n matrices. A ~ B (A is similar to B) means there exists an
invertible matrix P such that $A = P^{-1}BP$. Show that if A ~ B, then det(A) = det(B).

Exercise  3.1.21  Tell  whether  each  statement  is  true  or  false.  If  true,  provide  a  proof.  If  false,  provide  a 
counter  example. 

(a)  If  A  is  a  3x3  matrix  with  a  zero  determinant,  then  one  column  must  be  a  multiple  of  some  other 
column. 

(b)  If  any  two  columns  of  a  square  matrix  are  equal,  then  the  determinant  of  the  matrix  equals  zero. 

(c) For two $n \times n$ matrices A and B, $\det(A + B) = \det(A) + \det(B)$.

(d) For an $n \times n$ matrix A, $\det(3A) = 3\det(A)$.

(e) If $A^{-1}$ exists then $\det(A^{-1}) = \det(A)^{-1}$.

(f) If B is obtained by multiplying a single row of A by 4 then $\det(B) = 4\det(A)$.

(g) For A an $n \times n$ matrix, $\det(-A) = (-1)^n \det(A)$.

(h) If A is a real $n \times n$ matrix, then $\det(A^T A) \geq 0$.

(i) If $A^k = 0$ for some positive integer k, then $\det(A) = 0$.

(j) If $AX = 0$ for some $X \neq 0$, then $\det(A) = 0$.

Exercise  3.1.22  Find  the  determinant  using  row  operations  to  first  simplify. 

1  2  1 
2  3  2 
-4  1  2 


Exercise  3.1.23  Find  the  determinant  using  row  operations  to  first  simplify. 

2  1  3 

2  4  2 

1  4  -5 


Exercise 3.1.24 Find the determinant using row operations to first simplify.

\[ \begin{vmatrix} 1 & 2 & 1 & 2 \\ 3 & 1 & -2 & 3 \\ -1 & 0 & 3 & 1 \\ 2 & 3 & 2 & -2 \end{vmatrix} \]

Exercise 3.1.25 Find the determinant using row operations to first simplify.

\[ \begin{vmatrix} 1 & 4 & 1 & 2 \\ 3 & 2 & -2 & 3 \\ -1 & 0 & 3 & 3 \\ 2 & 1 & 2 & -2 \end{vmatrix} \]


3.2  Applications  of  the  Determinant 


Outcomes 


A. Use determinants to determine whether a matrix has an inverse, and evaluate the inverse using
cofactors.

B. Apply Cramer’s Rule to solve a $2 \times 2$ or a $3 \times 3$ linear system.

C.  Given  data  points,  find  an  appropriate  interpolating  polynomial  and  use  it  to  estimate  points. 


3.2.1.  A  Formula  for  the  Inverse 


The  determinant  of  a  matrix  also  provides  a  way  to  find  the  inverse  of  a  matrix.  Recall  the  definition  of 
the inverse of a matrix in Definition 2.33. We say that $A^{-1}$, an $n \times n$ matrix, is the inverse of A, also $n \times n$,
if $AA^{-1} = I$ and $A^{-1}A = I$.

We  now  define  a  new  matrix  called  the  cofactor  matrix  of  A.  The  cofactor  matrix  of  A  is  the  matrix 
whose  i  jth  entry  is  the  i  jth  cofactor  of  A.  The  formal  definition  is  as  follows. 


Definition  3.39:  The  Cofactor  Matrix 

Let $A = [a_{ij}]$ be an $n \times n$ matrix. Then the cofactor matrix of A, denoted cof(A), is defined by
\[ \operatorname{cof}(A) = \left[ \operatorname{cof}(A)_{ij} \right] \]
where $\operatorname{cof}(A)_{ij}$ is the ijth cofactor of A.

Note that $\operatorname{cof}(A)_{ij}$ denotes the ijth entry of the cofactor matrix.

We  will  use  the  cofactor  matrix  to  create  a  formula  for  the  inverse  of  A.  First,  we  define  the  adjugate 
of  A  to  be  the  transpose  of  the  cofactor  matrix.  We  can  also  call  this  matrix  the  classical  adjoint  of  A,  and 
we  denote  it  by  adj  (A). 

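In the specific case where A is a $2 \times 2$ matrix given by
\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]
then adj(A) is given by
\[ \operatorname{adj}(A) = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \]

In general, adj(A) can always be found by taking the transpose of the cofactor matrix of A. The
following theorem provides a formula for $A^{-1}$ using the determinant and adjugate of A.

Theorem 3.40: The Inverse and the Determinant

Let A be an $n \times n$ matrix. Then
\[ A \operatorname{adj}(A) = \operatorname{adj}(A) A = \det(A) I \]
Moreover A is invertible if and only if $\det(A) \neq 0$. In this case we have:
\[ A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A) \]

Notice that the first formula holds for any $n \times n$ matrix A, and in the case A is invertible we actually
have a formula for $A^{-1}$.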
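As a small illustration of the $2 \times 2$ case, the following Python sketch builds adj(A) by swapping the diagonal entries and negating the off-diagonal ones, then divides by the determinant. The function names `adj2` and `inverse2` and the sample matrix `M` are illustrative choices, not part of the text.

```python
from fractions import Fraction

def adj2(m):
    """Adjugate of a 2 x 2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return [[d, -b], [-c, a]]

def inverse2(m):
    """Inverse of a 2 x 2 matrix via A^{-1} = adj(A) / det(A)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("det(A) = 0, so A has no inverse")
    # Exact rational arithmetic avoids floating point round-off.
    return [[Fraction(entry, det) for entry in row] for row in adj2(m)]

M = [[2, 1], [5, 3]]  # hypothetical sample matrix with det(M) = 1
print(inverse2(M))
```

Using `Fraction` keeps the entries exact, which makes it easy to compare against a hand computation.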

Consider  the  following  example. 


Example  3.41:  Find  Inverse  Using  the  Determinant 

Find the inverse of the matrix
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 0 & 1 \\ 1 & 2 & 1 \end{bmatrix} \]
using the formula in Theorem 3.40.

Solution. According to Theorem 3.40,
\[ A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A) \]

First we will find the determinant of this matrix. Using Theorems 3.16, 3.18, and 3.21, we can first
simplify the matrix through row operations. First, add $-3$ times the first row to the second row. Then add
$-1$ times the first row to the third row to obtain
\[ B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & -6 & -8 \\ 0 & 0 & -2 \end{bmatrix} \]
By Theorem 3.21, $\det(A) = \det(B)$. By Theorem 3.13, $\det(B) = 1 \times (-6) \times (-2) = 12$. Hence, $\det(A) = 12$.
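Now, we need to find adj(A). To do so, first we will find the cofactor matrix of A. This is given by
\[ \operatorname{cof}(A) = \begin{bmatrix} -2 & -2 & 6 \\ 4 & -2 & 0 \\ 2 & 8 & -6 \end{bmatrix} \]
Here, the ijth entry is the ijth cofactor of the original matrix A, which you can verify. Therefore, from
Theorem 3.40, the inverse of A is given by
\[ A^{-1} = \frac{1}{12} \begin{bmatrix} -2 & -2 & 6 \\ 4 & -2 & 0 \\ 2 & 8 & -6 \end{bmatrix}^T = \begin{bmatrix} -\frac{1}{6} & \frac{1}{3} & \frac{1}{6} \\ -\frac{1}{6} & -\frac{1}{6} & \frac{2}{3} \\ \frac{1}{2} & 0 & -\frac{1}{2} \end{bmatrix} \]

Remember that we can always verify our answer for $A^{-1}$. Compute the products $AA^{-1}$ and $A^{-1}A$ and
make sure each product is equal to I.

Compute $A^{-1}A$ as follows
\[ A^{-1}A = \begin{bmatrix} -\frac{1}{6} & \frac{1}{3} & \frac{1}{6} \\ -\frac{1}{6} & -\frac{1}{6} & \frac{2}{3} \\ \frac{1}{2} & 0 & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 3 & 0 & 1 \\ 1 & 2 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I \]
You can verify that $AA^{-1} = I$ and hence our answer is correct.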
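The computation in Example 3.41 can be checked mechanically. The sketch below is one possible implementation, assuming a determinant computed by cofactor expansion along the first row; the helper names (`minor`, `det`, `cofactor_matrix`, `adjugate_inverse`) are illustrative and do not come from the text.

```python
from fractions import Fraction

def minor(m, i, j):
    """Return the matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def cofactor_matrix(m):
    """Matrix whose (i, j) entry is the (i, j) cofactor of m."""
    n = len(m)
    return [[(-1) ** (i + j) * det(minor(m, i, j)) for j in range(n)]
            for i in range(n)]

def adjugate_inverse(m):
    """Inverse via A^{-1} = adj(A) / det(A), with adj(A) = cof(A)^T."""
    d = det(m)
    if d == 0:
        raise ValueError("det(A) = 0, so A is not invertible")
    adj = [list(col) for col in zip(*cofactor_matrix(m))]  # transpose
    return [[Fraction(entry, d) for entry in row] for row in adj]

A = [[1, 2, 3], [3, 0, 1], [1, 2, 1]]  # the matrix of Example 3.41
print(det(A))
print(cofactor_matrix(A))
print(adjugate_inverse(A))
```

Cofactor expansion is chosen here because it mirrors the definitions in this section; for large matrices row reduction is far cheaper.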

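We will look at another example of how to use this formula to find $A^{-1}$.

Example 3.42: Find the Inverse From a Formula

Find the inverse of the matrix
\[ A = \begin{bmatrix} \frac{1}{2} & 0 & \frac{1}{2} \\ -\frac{1}{6} & \frac{1}{3} & -\frac{1}{2} \\ -\frac{5}{6} & \frac{2}{3} & -\frac{1}{2} \end{bmatrix} \]
using the formula given in Theorem 3.40.

Solution. First we need to find det(A). This step is left as an exercise and you should verify that $\det(A) = \frac{1}{6}$.
The inverse is therefore equal to
\[ A^{-1} = \frac{1}{(1/6)} \operatorname{adj}(A) = 6 \operatorname{adj}(A) \]
We continue to calculate as follows. Here we show the $2 \times 2$ determinants needed to find the cofactors.
\[ A^{-1} = 6 \begin{bmatrix}
\begin{vmatrix} \frac{1}{3} & -\frac{1}{2} \\ \frac{2}{3} & -\frac{1}{2} \end{vmatrix} &
-\begin{vmatrix} -\frac{1}{6} & -\frac{1}{2} \\ -\frac{5}{6} & -\frac{1}{2} \end{vmatrix} &
\begin{vmatrix} -\frac{1}{6} & \frac{1}{3} \\ -\frac{5}{6} & \frac{2}{3} \end{vmatrix} \\[2ex]
-\begin{vmatrix} 0 & \frac{1}{2} \\ \frac{2}{3} & -\frac{1}{2} \end{vmatrix} &
\begin{vmatrix} \frac{1}{2} & \frac{1}{2} \\ -\frac{5}{6} & -\frac{1}{2} \end{vmatrix} &
-\begin{vmatrix} \frac{1}{2} & 0 \\ -\frac{5}{6} & \frac{2}{3} \end{vmatrix} \\[2ex]
\begin{vmatrix} 0 & \frac{1}{2} \\ \frac{1}{3} & -\frac{1}{2} \end{vmatrix} &
-\begin{vmatrix} \frac{1}{2} & \frac{1}{2} \\ -\frac{1}{6} & -\frac{1}{2} \end{vmatrix} &
\begin{vmatrix} \frac{1}{2} & 0 \\ -\frac{1}{6} & \frac{1}{3} \end{vmatrix}
\end{bmatrix}^T \]
Expanding all the $2 \times 2$ determinants, this yields
\[ A^{-1} = 6 \begin{bmatrix} \frac{1}{6} & \frac{1}{3} & \frac{1}{6} \\ \frac{1}{3} & \frac{1}{6} & -\frac{1}{3} \\ -\frac{1}{6} & \frac{1}{6} & \frac{1}{6} \end{bmatrix}^T = \begin{bmatrix} 1 & 2 & -1 \\ 2 & 1 & 1 \\ 1 & -2 & 1 \end{bmatrix} \]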

Again, you can always check your work by multiplying $A^{-1}A$ and $AA^{-1}$ and ensuring these products
equal I.
\[ A^{-1}A = \begin{bmatrix} 1 & 2 & -1 \\ 2 & 1 & 1 \\ 1 & -2 & 1 \end{bmatrix} \begin{bmatrix} \frac{1}{2} & 0 & \frac{1}{2} \\ -\frac{1}{6} & \frac{1}{3} & -\frac{1}{2} \\ -\frac{5}{6} & \frac{2}{3} & -\frac{1}{2} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
This tells us that our calculation for $A^{-1}$ is correct. It is left to the reader to verify that $AA^{-1} = I$.

The verification step is very important, as it is a simple way to check your work! If you multiply $A^{-1}A$
and $AA^{-1}$ and these products are not both equal to I, be sure to go back and double check each step. One
common error is to forget to take the transpose of the cofactor matrix, so be sure to complete this step.

We  will  now  prove  Theorem  3.40. 

Proof. (of Theorem 3.40) Recall that the $(i,j)$-entry of adj(A) is equal to $\operatorname{cof}(A)_{ji}$. Thus the $(i,j)$-entry of
$B = A \cdot \operatorname{adj}(A)$ is:
\[ B_{ij} = \sum_{k=1}^{n} a_{ik} \operatorname{adj}(A)_{kj} = \sum_{k=1}^{n} a_{ik} \operatorname{cof}(A)_{jk} \]
By the cofactor expansion theorem, we see that this expression for $B_{ij}$ is equal to the determinant of the
matrix obtained from A by replacing its jth row by $a_{i1}, a_{i2}, \cdots, a_{in}$, i.e., its ith row.

If $i = j$ then this matrix is A itself and therefore $B_{ii} = \det A$. If on the other hand $i \neq j$, then this matrix
has its ith row equal to its jth row, and therefore $B_{ij} = 0$ in this case. Thus we obtain:
\[ A \operatorname{adj}(A) = \det(A) I \]
Similarly we can verify that:
\[ \operatorname{adj}(A) A = \det(A) I \]
And this proves the first part of the theorem.

Further if A is invertible, then by Theorem 3.24 we have:
\[ 1 = \det(I) = \det(AA^{-1}) = \det(A) \det(A^{-1}) \]
and thus $\det(A) \neq 0$. Equivalently, if $\det(A) = 0$, then A is not invertible.

Finally if $\det(A) \neq 0$, then the above formula shows that A is invertible and that:
\[ A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A) \]
This completes the proof.
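The first formula of the theorem, $A \operatorname{adj}(A) = \operatorname{adj}(A) A = \det(A) I$, holds even when A is not invertible. The following sketch checks this numerically for one singular matrix `S`, which is a made-up example (its second row is twice its first), and for the matrix of Example 3.41. All function names are illustrative.

```python
def minor(m, i, j):
    """Return the matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def adjugate(m):
    """adj(A): the (i, j) entry is the (j, i) cofactor of A."""
    n = len(m)
    return [[(-1) ** (i + j) * det(minor(m, j, i)) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def scaled_identity(c, n):
    """The matrix c * I of size n x n."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]

S = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]  # hypothetical singular matrix
A = [[1, 2, 3], [3, 0, 1], [1, 2, 1]]  # matrix of Example 3.41

for m in (S, A):
    d = det(m)
    print(d, matmul(m, adjugate(m)) == scaled_identity(d, 3),
          matmul(adjugate(m), m) == scaled_identity(d, 3))
```

For `S` both products are the zero matrix, which is consistent with $\det(S) = 0$ and shows why the formula for $A^{-1}$ needs $\det(A) \neq 0$.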

This  method  for  finding  the  inverse  of  A  is  useful  in  many  contexts.  In  particular,  it  is  useful  with 
complicated  matrices  where  the  entries  are  functions,  rather  than  numbers. 

Consider  the  following  example. 


Example  3.43:  Inverse  for  Non-Constant  Matrix 

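Suppose
\[ A(t) = \begin{bmatrix} e^t & 0 & 0 \\ 0 & \cos t & \sin t \\ 0 & -\sin t & \cos t \end{bmatrix} \]
Show that $A(t)^{-1}$ exists and then find it.

Solution. First note $\det(A(t)) = e^t(\cos^2 t + \sin^2 t) = e^t \neq 0$ so $A(t)^{-1}$ exists.

The cofactor matrix is
\[ C(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & -e^t \sin t & e^t \cos t \end{bmatrix} \]
and so the inverse is
\[ \frac{1}{e^t} \begin{bmatrix} 1 & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & -e^t \sin t & e^t \cos t \end{bmatrix}^T = \begin{bmatrix} e^{-t} & 0 & 0 \\ 0 & \cos t & -\sin t \\ 0 & \sin t & \cos t \end{bmatrix} \]

3.2.2. Cramer’s Rule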


Another  context  in  which  the  formula  given  in  Theorem  3.40  is  important  is  Cramer’s  Rule.  Recall  that 
we  can  represent  a  system  of  linear  equations  in  the  form  AX  =  B,  where  the  solutions  to  this  system 
are  given  by  X.  Cramer’s  Rule  gives  a  formula  for  the  solutions  X  in  the  special  case  that  A  is  a  square 
invertible  matrix.  Note  this  rule  does  not  apply  if  you  have  a  system  of  equations  in  which  there  is  a 
different  number  of  equations  than  variables  (in  other  words,  when  A  is  not  square),  or  when  A  is  not 
invertible. 

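Suppose we have a system of equations given by $AX = B$, and we want to find solutions X which
satisfy this system. Then recall that if $A^{-1}$ exists,
\[ \begin{aligned} AX &= B \\ A^{-1}(AX) &= A^{-1}B \\ (A^{-1}A)X &= A^{-1}B \\ IX &= A^{-1}B \\ X &= A^{-1}B \end{aligned} \]
Hence, the solutions X to the system are given by $X = A^{-1}B$. Since we assume that $A^{-1}$ exists, we can use
the formula for $A^{-1}$ given above. Substituting this formula into the equation for X, we have
\[ X = A^{-1}B = \frac{1}{\det(A)} \operatorname{adj}(A) B \]
Let $x_i$ be the ith entry of X and $b_j$ be the jth entry of B. Then this equation becomes
\[ x_i = \frac{1}{\det(A)} \sum_{j=1}^{n} \operatorname{adj}(A)_{ij} b_j \]
where $\operatorname{adj}(A)_{ij}$ is the ijth entry of adj(A).

By the formula for the expansion of a determinant along a column,
\[ x_i = \frac{1}{\det(A)} \det \begin{bmatrix} * & \cdots & b_1 & \cdots & * \\ \vdots & & \vdots & & \vdots \\ * & \cdots & b_n & \cdots & * \end{bmatrix} \]
where here the ith column of A is replaced with the column vector $[b_1 \cdots b_n]^T$. The determinant of this
modified matrix is taken and divided by det(A). This formula is known as Cramer’s rule.

We formally define this method now.

Procedure 3.44: Using Cramer’s Rule

Suppose A is an $n \times n$ invertible matrix and we wish to solve the system $AX = B$ for $X = [x_1 \cdots x_n]^T$.
Then Cramer’s rule says
\[ x_i = \frac{\det(A_i)}{\det(A)} \]
where $A_i$ is the matrix obtained by replacing the ith column of A with the column matrix B.

We illustrate this procedure in the following example.

Example 3.45: Using Cramer’s Rule

Find x, y, z if
\[ \begin{bmatrix} 1 & 2 & 1 \\ 3 & 2 & 1 \\ 2 & -3 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \]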


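Solution. We will use the method outlined in Procedure 3.44 to find the values for x, y, z which give the solution
to this system. Let A be the coefficient matrix of the system and let
\[ B = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \]
In order to find x, we calculate
\[ x = \frac{\det(A_1)}{\det(A)} \]
where $A_1$ is the matrix obtained from replacing the first column of A with B.

Hence, $A_1$ is given by
\[ A_1 = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 2 & 1 \\ 3 & -3 & 2 \end{bmatrix} \]
Therefore,
\[ x = \frac{\det(A_1)}{\det(A)} = \frac{\begin{vmatrix} 1 & 2 & 1 \\ 2 & 2 & 1 \\ 3 & -3 & 2 \end{vmatrix}}{\begin{vmatrix} 1 & 2 & 1 \\ 3 & 2 & 1 \\ 2 & -3 & 2 \end{vmatrix}} = \frac{1}{2} \]
Similarly, to find y we construct $A_2$ by replacing the second column of A with B. Hence, $A_2$ is given by
\[ A_2 = \begin{bmatrix} 1 & 1 & 1 \\ 3 & 2 & 1 \\ 2 & 3 & 2 \end{bmatrix} \]
Therefore,
\[ y = \frac{\det(A_2)}{\det(A)} = \frac{\begin{vmatrix} 1 & 1 & 1 \\ 3 & 2 & 1 \\ 2 & 3 & 2 \end{vmatrix}}{\begin{vmatrix} 1 & 2 & 1 \\ 3 & 2 & 1 \\ 2 & -3 & 2 \end{vmatrix}} = -\frac{1}{7} \]
Similarly, $A_3$ is constructed by replacing the third column of A with B. Then, $A_3$ is given by
\[ A_3 = \begin{bmatrix} 1 & 2 & 1 \\ 3 & 2 & 2 \\ 2 & -3 & 3 \end{bmatrix} \]
Therefore, z is calculated as follows.
\[ z = \frac{\det(A_3)}{\det(A)} = \frac{\begin{vmatrix} 1 & 2 & 1 \\ 3 & 2 & 2 \\ 2 & -3 & 3 \end{vmatrix}}{\begin{vmatrix} 1 & 2 & 1 \\ 3 & 2 & 1 \\ 2 & -3 & 2 \end{vmatrix}} = \frac{11}{14} \]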
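The steps of Cramer’s rule for a $3 \times 3$ system can be sketched in a few lines of Python. This is only an illustration under the assumption that the coefficient matrix has integer or rational entries; the names `det3`, `replace_column` and `cramer3` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def replace_column(m, col, b):
    """Copy of m whose column `col` is replaced by the entries of b."""
    return [row[:col] + [b[r]] + row[col + 1:] for r, row in enumerate(m)]

def cramer3(a, b):
    """Solve AX = B by x_i = det(A_i) / det(A); requires det(A) != 0."""
    d = det3(a)
    if d == 0:
        raise ValueError("Cramer's rule needs det(A) != 0")
    return [Fraction(det3(replace_column(a, i, b)), d) for i in range(3)]

A = [[1, 2, 1], [3, 2, 1], [2, -3, 2]]  # coefficient matrix used in the example
B = [1, 2, 3]
x, y, z = cramer3(A, B)
print(x, y, z)  # 1/2 -1/7 11/14
```

Each unknown costs one extra determinant, so for larger systems row reduction is usually the more practical choice.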


Cramer’s Rule gives you another tool to consider when solving a system of linear equations.

We can also use Cramer’s Rule when the matrix A has functions rather than numbers for entries. Such a
system is still linear in the unknowns. Consider the following system.


Example  3.46:  Use  Cramer’s  Rule  for  Non-Constant  Matrix 


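Solve for z if
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & -e^t \sin t & e^t \cos t \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ t \\ t^2 \end{bmatrix} \]

Solution. We are asked to find the value of z in the solution. We will solve using Cramer’s rule. Thus
\[ z = \frac{\begin{vmatrix} 1 & 0 & 1 \\ 0 & e^t \cos t & t \\ 0 & -e^t \sin t & t^2 \end{vmatrix}}{\begin{vmatrix} 1 & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & -e^t \sin t & e^t \cos t \end{vmatrix}} = t\left((\cos t)\, t + \sin t\right) e^{-t} \]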


3.2.3.  Polynomial  Interpolation 


In  studying  a  set  of  data  that  relates  variables  x  and  y,  it  may  be  the  case  that  we  can  use  a  polynomial  to 
“fit”  to  the  data.  If  such  a  polynomial  can  be  established,  it  can  be  used  to  estimate  values  of  x  and  y  which 
have  not  been  provided. 

Consider  the  following  example. 


Example  3.47:  Polynomial  Interpolation 


Given data points (1,4), (2,9), (3,12), find an interpolating polynomial p(x) of degree at most 2 and
then estimate the value corresponding to $x = \frac{1}{2}$.

Solution. We want to find a polynomial given by
\[ p(x) = r_0 + r_1 x + r_2 x^2 \]
such that $p(1) = 4$, $p(2) = 9$ and $p(3) = 12$. To find this polynomial, substitute the known values in for x
and solve for $r_0$, $r_1$, and $r_2$.
\[ \begin{aligned} p(1) &= r_0 + r_1 + r_2 = 4 \\ p(2) &= r_0 + 2r_1 + 4r_2 = 9 \\ p(3) &= r_0 + 3r_1 + 9r_2 = 12 \end{aligned} \]
Writing the augmented matrix, we have
\[ \left[ \begin{array}{ccc|c} 1 & 1 & 1 & 4 \\ 1 & 2 & 4 & 9 \\ 1 & 3 & 9 & 12 \end{array} \right] \]
After row operations, the resulting matrix is
\[ \left[ \begin{array}{ccc|c} 1 & 0 & 0 & -3 \\ 0 & 1 & 0 & 8 \\ 0 & 0 & 1 & -1 \end{array} \right] \]
Therefore the solution to the system is $r_0 = -3$, $r_1 = 8$, $r_2 = -1$ and the required interpolating polynomial
is
\[ p(x) = -3 + 8x - x^2 \]
To estimate the value for $x = \frac{1}{2}$, we calculate $p\left(\frac{1}{2}\right)$:
\[ p\left(\tfrac{1}{2}\right) = -3 + 8\left(\tfrac{1}{2}\right) - \left(\tfrac{1}{2}\right)^2 = -3 + 4 - \tfrac{1}{4} = \tfrac{3}{4} \]

This  procedure  can  be  used  for  any  number  of  data  points,  and  any  degree  of  polynomial.  The  steps 
are  outlined  below. 


This procedure motivates the following theorem.

Theorem 3.49: Polynomial Interpolation

Given n data points $(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)$ with the $x_i$ distinct, there is a unique polynomial
$p(x) = r_0 + r_1 x + r_2 x^2 + \cdots + r_{n-1} x^{n-1}$ such that $p(x_i) = y_i$ for $i = 1, 2, \cdots, n$. The resulting polynomial
$p(x)$ is called the interpolating polynomial for the data points.
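One standard way to see why distinct $x_i$ give a unique polynomial, not spelled out in the text, is that the coefficient matrix of the system (a Vandermonde matrix with rows $1, x_i, x_i^2, \cdots$) has determinant $\prod_{i<j}(x_j - x_i)$, which is nonzero exactly when the $x_i$ are distinct. The sketch below checks this identity numerically for one set of sample points; the function names are illustrative.

```python
from itertools import combinations
from math import prod

def minor(m, i, j):
    """Return the matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def vandermonde(xs):
    """Coefficient matrix of the interpolation system: row i is 1, x_i, x_i^2, ..."""
    return [[x ** k for k in range(len(xs))] for x in xs]

xs = [0, 1, 3, 5]  # sample x values, all distinct
lhs = det(vandermonde(xs))
rhs = prod(xj - xi for xi, xj in combinations(xs, 2))  # product over i < j
print(lhs, rhs)
```

If two of the x values coincide, two rows of the matrix are equal and the determinant is zero, so the system no longer has a unique solution.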


We  conclude  this  section  with  another  example. 


Example  3.50:  Polynomial  Interpolation 


Consider the data points (0,1), (1,2), (3,22), (5,66). Find an interpolating polynomial p(x) of degree
at most three, and estimate the value of p(2).

Solution. The desired polynomial p(x) is given by:
\[ p(x) = r_0 + r_1 x + r_2 x^2 + r_3 x^3 \]
Using the given points, the system of equations is
\[ \begin{aligned} p(0) &= r_0 = 1 \\ p(1) &= r_0 + r_1 + r_2 + r_3 = 2 \\ p(3) &= r_0 + 3r_1 + 9r_2 + 27r_3 = 22 \\ p(5) &= r_0 + 5r_1 + 25r_2 + 125r_3 = 66 \end{aligned} \]
The augmented matrix is given by:
\[ \left[ \begin{array}{cccc|c} 1 & 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 & 2 \\ 1 & 3 & 9 & 27 & 22 \\ 1 & 5 & 25 & 125 & 66 \end{array} \right] \]
The resulting matrix is
\[ \left[ \begin{array}{cccc|c} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & -2 \\ 0 & 0 & 1 & 0 & 3 \\ 0 & 0 & 0 & 1 & 0 \end{array} \right] \]
Therefore, $r_0 = 1$, $r_1 = -2$, $r_2 = 3$, $r_3 = 0$ and $p(x) = 1 - 2x + 3x^2$. To estimate the value of p(2), we
compute $p(2) = 1 - 2(2) + 3(2^2) = 1 - 4 + 12 = 9$.
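The interpolation examples above follow the same pattern: build the augmented matrix from the data points, row reduce, and read off the coefficients. The following sketch automates those steps with exact fractions. It assumes the x values are distinct (so a pivot always exists), and the names `solve`, `interpolate` and `evaluate` are illustrative.

```python
from fractions import Fraction

def solve(aug):
    """Gauss-Jordan elimination on an augmented matrix with a unique solution."""
    m = [[Fraction(v) for v in row] for row in aug]
    n = len(m)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry and swap it up.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        m[col] = [v / m[col][col] for v in m[col]]
        # Clear the rest of the column.
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [row[-1] for row in m]

def interpolate(points):
    """Coefficients r_0, ..., r_{n-1} of the polynomial through the n points."""
    n = len(points)
    aug = [[x ** k for k in range(n)] + [y] for x, y in points]
    return solve(aug)

def evaluate(coeffs, x):
    """Value of r_0 + r_1 x + ... at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

coeffs = interpolate([(0, 1), (1, 2), (3, 22), (5, 66)])
print(coeffs, evaluate(coeffs, 2))
```

The same functions reproduce Example 3.47 when called with the points (1,4), (2,9), (3,12).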




Exercises 


Exercise  3.2.1  Let 


A  = 


1  2  3 
0  2  1 
3  1  0 


Determine  whether  the  matrix  A  has  an  inverse  by  finding  whether  the  determinant  is  non  zero.  If  the 
determinant  is  nonzero,  find  the  inverse  using  the  formula  for  the  inverse  which  involves  the  cofactor 
matrix. 


Exercise  3.2.2  Let 


A  = 


1  2  0 
0  2  1 
3  1  1 


Determine  whether  the  matrix  A  has  an  inverse  by  finding  whether  the  determinant  is  non  zero.  If  the 
determinant  is  nonzero,  find  the  inverse  using  the  formula  for  the  inverse. 


Exercise  3.2.3  Let 


A  = 


1  3  3 

2  4  1 
0  1  1 


Determine  whether  the  matrix  A  has  an  inverse  by  finding  whether  the  determinant  is  non  zero.  If  the 
determinant  is  nonzero,  find  the  inverse  using  the  formula  for  the  inverse. 


Exercise  3.2.4  Let 


A  = 


1  2  3 
0  2  1 

2  6  7 


Determine  whether  the  matrix  A  has  an  inverse  by  finding  whether  the  determinant  is  non  zero.  If  the 
determinant  is  nonzero,  find  the  inverse  using  the  formula  for  the  inverse. 


Exercise  3.2.5  Let 


A  = 


1  0  3 
1  0  1 
3  1  0 


Determine  whether  the  matrix  A  has  an  inverse  by  finding  whether  the  determinant  is  non  zero.  If  the 
determinant  is  nonzero,  find  the  inverse  using  the  formula  for  the  inverse. 


Exercise 3.2.6 For the following matrices, determine if they are invertible. If so, use the formula for the
inverse in terms of the cofactor matrix to find each inverse. If the inverse does not exist, explain why.

(b)
\[ \begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 1 \\ 4 & 1 & 1 \end{bmatrix} \]

(c)
\[ \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 0 \\ 0 & 1 & 2 \end{bmatrix} \]

Exercise  3.2.7  Consider  the  matrix 


A  = 


1  0  0 

0  cost  —  sint 
0  sint  cost 


Does  there  exist  a  value  of  t  for  which  this  matrix  fails  to  have  an  inverse?  Explain. 


Exercise  3.2.8  Consider  the  matrix 


\[ A = \begin{bmatrix} 1 & t & t^2 \\ 0 & 1 & 2t \\ t & 0 & 2 \end{bmatrix} \]

Does there exist a value of t for which this matrix fails to have an inverse? Explain.

Exercise 3.2.9 Consider the matrix

\[ A = \begin{bmatrix} e^t & \cosh t & \sinh t \\ e^t & \sinh t & \cosh t \\ e^t & \cosh t & \sinh t \end{bmatrix} \]

Does there exist a value of t for which this matrix fails to have an inverse? Explain.

Exercise 3.2.10 Consider the matrix

\[ A = \begin{bmatrix} e^t & e^{-t} \cos t & e^{-t} \sin t \\ e^t & -e^{-t} \cos t - e^{-t} \sin t & -e^{-t} \sin t + e^{-t} \cos t \\ e^t & 2e^{-t} \sin t & -2e^{-t} \cos t \end{bmatrix} \]

Does there exist a value of t for which this matrix fails to have an inverse? Explain.


Exercise 3.2.11 Show that if $\det(A) \neq 0$ for A an $n \times n$ matrix, it follows that if $AX = 0$, then $X = 0$.

Exercise 3.2.12 Suppose A, B are $n \times n$ matrices and that $AB = I$. Show that then $BA = I$. Hint: First
explain why det(A), det(B) are both nonzero. Then $(AB)A = A$ and then show $BA(BA - I) = 0$. From this
use what is given to conclude $A(BA - I) = 0$. Then use Problem 3.2.11.


Exercise 3.2.13 Use the formula for the inverse in terms of the cofactor matrix to find the inverse of the
matrix

\[ A = \begin{bmatrix} e^t & 0 & 0 \\ 0 & e^t \cos t & e^t \sin t \\ 0 & e^t \cos t - e^t \sin t & e^t \cos t + e^t \sin t \end{bmatrix} \]

Exercise 3.2.14 Find the inverse, if it exists, of the matrix

\[ A = \begin{bmatrix} e^t & \cos t & \sin t \\ e^t & -\sin t & \cos t \\ e^t & -\cos t & -\sin t \end{bmatrix} \]

Exercise 3.2.15 Suppose A is an upper triangular matrix. Show that $A^{-1}$ exists if and only if all elements
of the main diagonal are nonzero. Is it true that $A^{-1}$ will also be upper triangular? Explain. Could the
same be concluded for lower triangular matrices?

Exercise 3.2.16 If A, B, and C are each $n \times n$ matrices and ABC is invertible, show why each of A, B, and
C are invertible.

Exercise  3.2.17  Decide  if  this  statement  is  true  or  false:  Cramer’s  rule  is  useful  for  finding  solutions  to 
systems  of  linear  equations  in  which  there  is  an  infinite  set  of  solutions. 

Exercise  3.2.18  Use  Cramer’s  rule  to  find  the  solution  to 

\[ \begin{aligned} x + 2y &= 1 \\ 2x - y &= 2 \end{aligned} \]

Exercise 3.2.19 Use Cramer’s rule to find the solution to

\[ \begin{aligned} x + 2y + z &= 1 \\ 2x - y - z &= 2 \\ x + z &= 1 \end{aligned} \]


4. $\mathbb{R}^n$


4.1 Vectors in $\mathbb{R}^n$


Outcomes


A. Find the position vector of a point in $\mathbb{R}^n$.


The notation $\mathbb{R}^n$ refers to the collection of ordered lists of n real numbers, that is
\[ \mathbb{R}^n = \left\{ (x_1, \cdots, x_n) : x_j \in \mathbb{R} \text{ for } j = 1, \cdots, n \right\} \]
In this chapter, we take a closer look at vectors in $\mathbb{R}^n$. First, we will consider what $\mathbb{R}^n$ looks like in more
detail. Recall that the point given by $0 = (0, \cdots, 0)$ is called the origin.

Now, consider the case of $\mathbb{R}^n$ for $n = 1$. Then from the definition we can identify $\mathbb{R}$ with points in $\mathbb{R}^1$
as follows:
\[ \mathbb{R} = \mathbb{R}^1 = \left\{ (x_1) : x_1 \in \mathbb{R} \right\} \]
Hence, $\mathbb{R}$ is defined as the set of all real numbers and geometrically, we can describe this as all the points
on a line.

Now suppose $n = 2$. Then, from the definition,
\[ \mathbb{R}^2 = \left\{ (x_1, x_2) : x_j \in \mathbb{R} \text{ for } j = 1, 2 \right\} \]

Consider  the  familiar  coordinate  plane,  with  an  x  axis  and  a  y  axis.  Any  point  within  this  coordinate  plane 
is  identified  by  where  it  is  located  along  the  x  axis,  and  also  where  it  is  located  along  the  y  axis.  Consider 
as  an  example  the  following  diagram. 


[Figure: the points P = (2,1) and Q = (-3,4) plotted in the coordinate plane with x and y axes.]

Hence, every element in $\mathbb{R}^2$ is identified by two components, x and y, in the usual manner. The
coordinates x, y (or $x_1, x_2$) uniquely determine a point in the plane. Note that while the definition uses $x_1$ and
$x_2$ to label the coordinates and you may be used to x and y, these notations are equivalent.

Now suppose $n = 3$. You may have previously encountered the 3-dimensional coordinate system, given
by
\[ \mathbb{R}^3 = \left\{ (x_1, x_2, x_3) : x_j \in \mathbb{R} \text{ for } j = 1, 2, 3 \right\} \]
Points in $\mathbb{R}^3$ will be determined by three coordinates, often written (x, y, z), which correspond to the x,
y, and z axes. We can think as above that the first two coordinates determine a point in a plane. The third
component determines the height above or below the plane, depending on whether this number is positive
or negative, and all together this determines a point in space. You see that the ordered triples correspond to
points in space just as the ordered pairs correspond to points in a plane and single real numbers correspond
to points on a line.

The idea behind the more general $\mathbb{R}^n$ is that we can extend these ideas beyond $n = 3$. This discussion
regarding points in $\mathbb{R}^n$ leads into a study of vectors in $\mathbb{R}^n$. While we consider $\mathbb{R}^n$ for all n, we will largely
focus on $n = 2, 3$ in this section.

Consider  the  following  definition. 


For this reason we may write both $P = (p_1, \cdots, p_n) \in \mathbb{R}^n$ and $\overrightarrow{0P} = [p_1 \cdots p_n]^T \in \mathbb{R}^n$.
This definition is illustrated in the following picture for the special case of $\mathbb{R}^3$.

[Figure: the point $P = (p_1, p_2, p_3)$ and its position vector $\overrightarrow{0P} = [p_1 \; p_2 \; p_3]^T$.]

Thus every point P in $\mathbb{R}^n$ determines its position vector $\overrightarrow{0P}$. Conversely, every such position vector $\overrightarrow{0P}$
which has its tail at 0 and point at P determines the point P of $\mathbb{R}^n$.

Now suppose we are given two points, P, Q whose coordinates are $(p_1, \cdots, p_n)$ and $(q_1, \cdots, q_n)$
respectively. We can also determine the position vector from P to Q (also called the vector from P to Q)
defined as follows.
\[ \overrightarrow{PQ} = \begin{bmatrix} q_1 - p_1 \\ \vdots \\ q_n - p_n \end{bmatrix} = \overrightarrow{0Q} - \overrightarrow{0P} \]

Now, imagine taking a vector in $\mathbb{R}^n$ and moving it around, always keeping it pointing in the same
direction as shown in the following picture.

After moving it around, it is regarded as the same vector. Each vector, $\overrightarrow{0P}$ and $\overrightarrow{AB}$, has the same length
(or magnitude) and direction. Therefore, they are equal.

Consider now the general definition for a vector in $\mathbb{R}^n$.


Definition 4.2: Vectors in $\mathbb{R}^n$

Let $\mathbb{R}^n = \left\{ (x_1, \cdots, x_n) : x_j \in \mathbb{R} \text{ for } j = 1, \cdots, n \right\}$. Then,
\[ \vec{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \]
is called a vector. Vectors have both size (magnitude) and direction. The numbers $x_j$ are called the
components of $\vec{x}$.

Using this notation, we may use $\vec{p}$ to denote the position vector of point P. Notice that in this context,
$\vec{p} = \overrightarrow{0P}$. These notations may be used interchangeably.

You can think of the components of a vector as directions for obtaining the vector. Consider $n = 3$.
Draw a vector with its tail at the point (0,0,0) and its tip at the point (a,b,c). This vector is obtained
by starting at (0,0,0), moving parallel to the x axis to (a,0,0) and then from here, moving parallel to the
y axis to (a,b,0) and finally parallel to the z axis to (a,b,c). Observe that the same vector would result if
you began at the point (d,e,f), moved parallel to the x axis to (d + a, e, f), then parallel to the y axis to
(d + a, e + b, f), and finally parallel to the z axis to (d + a, e + b, f + c). Here, the vector would have its
tail sitting at the point determined by A = (d,e,f) and its point at B = (d + a, e + b, f + c). It is the same
vector because it will point in the same direction and have the same length. It is as if you took an actual
arrow and moved it from one location to another, keeping it pointing in the same direction.

We  conclude  this  section  with  a  brief  discussion  regarding  notation.  In  previous  sections,  we  have 
written  vectors  as  columns,  or  n  x  1  matrices.  For  convenience  in  this  chapter  we  may  write  vectors  as  the 
transpose  of  row  vectors,  or  1  x  n  matrices.  These  are  of  course  equivalent  and  we  may  move  between 
both  notations.  Therefore,  recognize  that 


\[ \begin{bmatrix} 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 2 & 3 \end{bmatrix}^T \]

Notice that two vectors $\vec{u} = [u_1 \cdots u_n]^T$ and $\vec{v} = [v_1 \cdots v_n]^T$ are equal if and only if all corresponding
components are equal. Precisely,
\[ \vec{u} = \vec{v} \text{ if and only if } u_j = v_j \text{ for all } j = 1, \cdots, n \]
Thus $[1 \; 2 \; 4]^T \in \mathbb{R}^3$ and $[2 \; 1 \; 4]^T \in \mathbb{R}^3$ but $[1 \; 2 \; 4]^T \neq [2 \; 1 \; 4]^T$ because, even though
the same numbers are involved, the order of the numbers is different.

For the specific case of $\mathbb{R}^3$, there are three special vectors which we often use. They are given by
\[ \vec{i} = [1 \; 0 \; 0]^T, \quad \vec{j} = [0 \; 1 \; 0]^T, \quad \vec{k} = [0 \; 0 \; 1]^T \]
We can write any vector $\vec{u} = [u_1 \; u_2 \; u_3]^T$ as a linear combination of these vectors, written as
$\vec{u} = u_1 \vec{i} + u_2 \vec{j} + u_3 \vec{k}$. This notation will be used throughout this chapter.


4.2 Algebra in $\mathbb{R}^n$


Outcomes 


A.  Understand  vector  addition  and  scalar  multiplication,  algebraically. 

B.  Introduce  the  notion  of  linear  combination  of  vectors. 


Addition and scalar multiplication are two important algebraic operations done with vectors. Notice
that these operations apply to vectors in $\mathbb{R}^n$, for any value of n. We will explore these operations in more
detail in the following sections.

4.2.1. Addition of Vectors in $\mathbb{R}^n$


Addition of vectors in $\mathbb{R}^n$ is defined as follows.

To add vectors, we simply add corresponding components. Therefore, in order to add vectors, they
must be the same size.

Addition of vectors satisfies some important properties which are outlined in the following theorem.


The additive identity shown in equation 4.1 is also called the zero vector, the $n \times 1$ vector in which
all components are equal to 0. Further, $-\vec{u}$ is simply the vector with all components having the same value as
those of $\vec{u}$ but opposite sign; this is just $(-1)\vec{u}$. This will be made more explicit in the next section when
we explore scalar multiplication of vectors. Note that subtraction is defined as $\vec{u} - \vec{v} = \vec{u} + (-\vec{v})$.

4.2.2. Scalar Multiplication of Vectors in $\mathbb{R}^n$


Scalar multiplication of vectors in $\mathbb{R}^n$ is defined as follows.


Just  as  with  addition,  scalar  multiplication  of  vectors  satisfies  several  important  properties.  These  are 
outlined  in  the  following  theorem. 


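Proof. We will show the proof of:
\[ k(\vec{u} + \vec{v}) = k\vec{u} + k\vec{v} \]
Note that:
\[ \begin{aligned} k(\vec{u} + \vec{v}) &= k[u_1 + v_1 \cdots u_n + v_n]^T \\ &= [k(u_1 + v_1) \cdots k(u_n + v_n)]^T \\ &= [ku_1 + kv_1 \cdots ku_n + kv_n]^T \\ &= [ku_1 \cdots ku_n]^T + [kv_1 \cdots kv_n]^T \\ &= k\vec{u} + k\vec{v} \end{aligned} \]

We now present a useful notion you may have seen earlier combining vector addition and scalar
multiplication.


For example,
\[ 3 \begin{bmatrix} -4 \\ 1 \\ 0 \end{bmatrix} + 2 \begin{bmatrix} -3 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -18 \\ 3 \\ 2 \end{bmatrix} \]
Thus we can say that
\[ \vec{v} = \begin{bmatrix} -18 \\ 3 \\ 2 \end{bmatrix} \]
is a linear combination of the vectors
\[ \vec{u}_1 = \begin{bmatrix} -4 \\ 1 \\ 0 \end{bmatrix} \text{ and } \vec{u}_2 = \begin{bmatrix} -3 \\ 0 \\ 1 \end{bmatrix} \]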
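Componentwise addition, scalar multiplication and linear combinations translate directly into code. The sketch below represents vectors as Python lists; the function names `add`, `scale` and `linear_combination` are illustrative and not notation from the text.

```python
def add(u, v):
    """Add two vectors componentwise; they must have the same size."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return [a + b for a, b in zip(u, v)]

def scale(k, u):
    """Multiply every component of u by the scalar k."""
    return [k * a for a in u]

def linear_combination(coeffs, vectors):
    """Return coeffs[0] * vectors[0] + coeffs[1] * vectors[1] + ..."""
    result = [0] * len(vectors[0])
    for k, v in zip(coeffs, vectors):
        result = add(result, scale(k, v))
    return result

u1 = [-4, 1, 0]
u2 = [-3, 0, 1]
print(linear_combination([3, 2], [u1, u2]))  # [-18, 3, 2]

# The distributive property k(u + v) = ku + kv on this pair of vectors.
print(scale(5, add(u1, u2)) == add(scale(5, u1), scale(5, u2)))  # True
```

The size check in `add` reflects the requirement that vectors be the same size before they can be added.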

Exercises 


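Exercise 4.2.1 Find
\[ -3 \begin{bmatrix} 5 \\ -1 \\ 2 \\ -3 \end{bmatrix} + 5 \begin{bmatrix} -8 \\ 2 \\ -3 \\ 6 \end{bmatrix} \]

Exercise 4.2.2 Find
\[ -7 \begin{bmatrix} 6 \\ 0 \\ 4 \\ -1 \end{bmatrix} + 6 \begin{bmatrix} -13 \\ -1 \\ 1 \\ 6 \end{bmatrix} \]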

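Exercise 4.2.3 Decide whether
\[ \vec{v} = \begin{bmatrix} 4 \\ 4 \\ -3 \end{bmatrix} \]
is a linear combination of the vectors
\[ \vec{u}_1 = \begin{bmatrix} 3 \\ 1 \\ -1 \end{bmatrix} \text{ and } \vec{u}_2 = \begin{bmatrix} 2 \\ -2 \\ 1 \end{bmatrix} \]

Exercise 4.2.4 Decide whether
\[ \vec{v} = \begin{bmatrix} 4 \\ 4 \\ 4 \end{bmatrix} \]
is a linear combination of the vectors
\[ \vec{u}_1 = \begin{bmatrix} 3 \\ 1 \\ -1 \end{bmatrix} \text{ and } \vec{u}_2 = \begin{bmatrix} 2 \\ -2 \\ 1 \end{bmatrix} \]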

4.3  Geometric  Meaning  of  Vector  Addition 


Recall  that  an  element  of  W1  is  an  ordered  list  of  numbers.  For  the  specific  case  of  n  —  2,3  this  can 
be  used  to  determine  a  point  in  two  or  three  dimensional  space.  This  point  is  specified  relative  to  some 
coordinate  axes. 

Consider  the  case  n  —  3.  Recall  that  taking  a  vector  and  moving  it  around  without  changing  its  length  or 
direction  does  not  change  the  vector.  This  is  important  in  the  geometric  representation  of  vector  addition. 

Suppose  we  have  two  vectors,  u  and  v  in  M3.  Each  of  these  can  be  drawn  geometrically  by  placing  the 
tail  of  each  vector  at  0  and  its  point  at  (ui, 112,113)  and  (vi,V2,V3)  respectively.  Suppose  we  slide  the  vector 
v  so  that  its  tail  sits  at  the  point  of  u.  We  know  that  this  does  not  change  the  vector  v.  Now,  draw  a  new 
vector  from  the  tail  of  u  to  the  point  of  v.  This  vector  is  u  +  v. 

The  geometric  significance  of  vector  addition  in  M"  for  any  n  is  given  in  the  following  definition. 


This definition is illustrated in the following picture, in which $\vec{u} + \vec{v}$ is shown for the special case $n = 3$.

Notice the parallelogram created by $\vec{u}$ and $\vec{v}$ in the above diagram. Then $\vec{u} + \vec{v}$ is the directed diagonal of the parallelogram determined by the two vectors $\vec{u}$ and $\vec{v}$.

When you have a vector $\vec{v}$, its additive inverse $-\vec{v}$ is the vector which has the same magnitude as $\vec{v}$ but the opposite direction. When one writes $\vec{u} - \vec{v}$, the meaning is $\vec{u} + (-\vec{v})$, as with real numbers. The following example illustrates these definitions and conventions.


Example 4.9: Graphing Vector Addition

Consider the following picture of vectors $\vec{u}$ and $\vec{v}$.

Sketch a picture of $\vec{u} + \vec{v}$ and $\vec{u} - \vec{v}$.

Solution. We will first sketch $\vec{u} + \vec{v}$. Begin by drawing $\vec{u}$ and then, at the point of $\vec{u}$, place the tail of $\vec{v}$ as shown. Then $\vec{u} + \vec{v}$ is the vector which results from drawing a vector from the tail of $\vec{u}$ to the tip of $\vec{v}$.


Next consider $\vec{u} - \vec{v}$. This means $\vec{u} + (-\vec{v})$. From the above geometric description of vector addition, $-\vec{v}$ is the vector which has the same length as $\vec{v}$ but points in the opposite direction. Here is a picture.
4.4 Length of a Vector


Outcomes 


A. Find the length of a vector and the distance between two points in $\mathbb{R}^n$.

B. Find the corresponding unit vector to a vector in $\mathbb{R}^n$.


In this section, we explore what is meant by the length of a vector in $\mathbb{R}^n$. We develop this concept by first looking at the distance between two points in $\mathbb{R}^n$.

First, we will consider the concept of distance for $\mathbb{R}$, that is, for points in $\mathbb{R}^1$. Here, the distance between two points $P$ and $Q$ is given by the absolute value of their difference. We denote the distance between $P$ and $Q$ by $d(P,Q)$, which is defined as

$$d(P,Q) = \sqrt{(P-Q)^2} \tag{4.2}$$

Consider now the case for $n = 2$, demonstrated by the following picture.

[Figure: the points $P = (p_1, p_2)$ and $Q = (q_1, q_2)$ joined by a solid line, which is the diagonal of a dotted rectangle.]

There are two points $P = (p_1, p_2)$ and $Q = (q_1, q_2)$ in the plane. The distance between these points is shown in the picture as a solid line. Notice that this line is the hypotenuse of a right triangle which is half of the rectangle shown in dotted lines. We want to find the length of this hypotenuse, which will give the distance between the two points. Note the lengths of the sides of this triangle are $|p_1 - q_1|$ and $|p_2 - q_2|$, the absolute values of the differences in these values. Therefore, the Pythagorean Theorem implies the length of the hypotenuse (and thus the distance between $P$ and $Q$) equals

$$\left(|p_1 - q_1|^2 + |p_2 - q_2|^2\right)^{1/2} = \left((p_1 - q_1)^2 + (p_2 - q_2)^2\right)^{1/2} \tag{4.3}$$

Now suppose $n = 3$ and let $P = (p_1, p_2, p_3)$ and $Q = (q_1, q_2, q_3)$ be two points in $\mathbb{R}^3$. Consider the following picture in which the solid line joins the two points and a dotted line joins the points $(q_1, q_2, q_3)$ and $(p_1, p_2, q_3)$.
Here, we need to use the Pythagorean Theorem twice in order to find the length of the solid line. First, by the Pythagorean Theorem, the length of the dotted line joining $(q_1, q_2, q_3)$ and $(p_1, p_2, q_3)$ equals

$$\left((p_1 - q_1)^2 + (p_2 - q_2)^2\right)^{1/2}$$

while the length of the line joining $(p_1, p_2, q_3)$ to $(p_1, p_2, p_3)$ is just $|p_3 - q_3|$. Therefore, by the Pythagorean Theorem again, the length of the line joining the points $P = (p_1, p_2, p_3)$ and $Q = (q_1, q_2, q_3)$ equals

$$\left(\left(\left((p_1 - q_1)^2 + (p_2 - q_2)^2\right)^{1/2}\right)^2 + (p_3 - q_3)^2\right)^{1/2} = \left((p_1 - q_1)^2 + (p_2 - q_2)^2 + (p_3 - q_3)^2\right)^{1/2} \tag{4.4}$$

This discussion motivates the following definition for the distance between points in $\mathbb{R}^n$.


From the above discussion, you can see that Definition 4.10 holds for the special cases $n = 1, 2, 3$, as in Equations 4.2, 4.3, 4.4. In the following example, we use Definition 4.10 to find the distance between two points in $\mathbb{R}^4$.
Solution. We will use the formula given in Definition 4.10 to find the distance between $P$ and $Q$. Use the distance formula and write

$$d(P,Q) = \left((1-2)^2 + (2-3)^2 + (-4-(-1))^2 + (6-0)^2\right)^{1/2} = (1 + 1 + 9 + 36)^{1/2} = \sqrt{47}$$

Therefore, $d(P,Q) = \sqrt{47}$.

♠
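
The distance formula translates directly into a short computation. The sketch below is illustrative only: the function name `distance` is an assumption, and the coordinates of $P$ and $Q$ are the ones appearing in the computation above.

```python
import math


def distance(p, q):
    """Distance between two points of R^n given as equal-length sequences."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((pk - qk) ** 2 for pk, qk in zip(p, q)))


P = (1, 2, -4, 6)
Q = (2, 3, -1, 0)
print(distance(P, Q))  # numerically equal to sqrt(47)
```

Note that the same function covers every $n$, including the cases $n = 1, 2, 3$ treated geometrically above.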

There  are  certain  properties  of  the  distance  between  points  which  are  important  in  our  study.  These  are 
outlined  in  the  following  theorem. 


There are many applications of the concept of distance. For instance, given two points, we can ask which collection of points is equidistant from the two given points. This is explored in the following example.


Example  4.13:  The  Plane  Between  Two  Points 


Describe the points in $\mathbb{R}^3$ which are at the same distance from $(1,2,3)$ and $(0,1,2)$.


Solution. Let $P = (p_1, p_2, p_3)$ be such a point. Therefore, $P$ is the same distance from $(1,2,3)$ and $(0,1,2)$. Then by Definition 4.10,

$$\sqrt{(p_1 - 1)^2 + (p_2 - 2)^2 + (p_3 - 3)^2} = \sqrt{(p_1 - 0)^2 + (p_2 - 1)^2 + (p_3 - 2)^2}$$

Squaring both sides we obtain

$$(p_1 - 1)^2 + (p_2 - 2)^2 + (p_3 - 3)^2 = p_1^2 + (p_2 - 1)^2 + (p_3 - 2)^2$$

and so

$$p_1^2 - 2p_1 + 14 + p_2^2 - 4p_2 + p_3^2 - 6p_3 = p_1^2 + p_2^2 - 2p_2 + 5 + p_3^2 - 4p_3$$

Simplifying, this becomes

$$-2p_1 + 14 - 4p_2 - 6p_3 = -2p_2 + 5 - 4p_3$$

which can be written as

$$2p_1 + 2p_2 + 2p_3 = 9 \tag{4.5}$$

Therefore, the points $P = (p_1, p_2, p_3)$ which are the same distance from each of the given points form a plane whose equation is given by 4.5. As a check, the midpoint $\left(\tfrac{1}{2}, \tfrac{3}{2}, \tfrac{5}{2}\right)$ of the two given points satisfies 4.5. ♠




We  can  now  use  our  understanding  of  the  distance  between  two  points  to  define  what  is  meant  by  the 
length  of  a  vector.  Consider  the  following  definition. 


This definition corresponds to Definition 4.10, if you consider the vector $\vec{u}$ to have its tail at the point $0 = (0, \cdots, 0)$ and its tip at the point $U = (u_1, \cdots, u_n)$. Then the length of $\vec{u}$ is equal to the distance between $0$ and $U$, $d(0,U)$. In general, $d(P,Q) = \|\overrightarrow{PQ}\|$.

Consider Example 4.11. By Definition 4.14, we could also find the distance between $P$ and $Q$ as the length of the vector connecting them. Hence, if we were to draw a vector $\overrightarrow{PQ}$ with its tail at $P$ and its point at $Q$, this vector would have length equal to $\sqrt{47}$.

We  conclude  this  section  with  a  new  definition  for  the  special  case  of  vectors  of  length  1. 


Let $\vec{v}$ be a nonzero vector in $\mathbb{R}^n$. Then, the vector $\vec{u}$ which has the same direction as $\vec{v}$ but length equal to 1 is the corresponding unit vector of $\vec{v}$. This vector is given by

$$\vec{u} = \frac{1}{\|\vec{v}\|}\vec{v}$$

We often use the term normalize to refer to this process. When we normalize a vector, we find the corresponding unit vector of length 1. Consider the following example.


Solution. We will use Definition 4.15 to solve this. Therefore, we need to find the length of $\vec{v}$ which, by Definition 4.14, is given by

$$\|\vec{v}\| = \sqrt{v_1^2 + v_2^2 + v_3^2}$$

Using the corresponding values we find that

$$\|\vec{v}\| = \sqrt{1^2 + (-3)^2 + 4^2} = \sqrt{1 + 9 + 16} = \sqrt{26}$$

In order to find $\vec{u}$, we divide $\vec{v}$ by $\sqrt{26}$. The result is

$$\vec{u} = \frac{1}{\sqrt{26}}\begin{bmatrix} 1 & -3 & 4 \end{bmatrix}^T = \begin{bmatrix} \frac{1}{\sqrt{26}} & -\frac{3}{\sqrt{26}} & \frac{4}{\sqrt{26}} \end{bmatrix}^T$$

You can verify using Definition 4.14 that $\|\vec{u}\| = 1$.

♠
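
Normalizing a vector is a two-step computation: find the length, then divide each entry by it. The following minimal sketch uses hypothetical helper names `length` and `normalize`, and rejects the zero vector, which has no direction to preserve.

```python
import math


def length(v):
    """Length of a vector: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v):
    """Return the unit vector with the same direction as v."""
    norm = length(v)
    if norm == 0:
        raise ValueError("the zero vector cannot be normalized")
    return [x / norm for x in v]


v = [1, -3, 4]
u = normalize(v)
print(length(v) ** 2)  # approximately 26
print(length(u))       # approximately 1
```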


4.5  Geometric  Meaning  of  Scalar  Multiplication 


Recall that the point $P = (p_1, p_2, p_3)$ determines a vector $\vec{p}$ from $0$ to $P$. The length of $\vec{p}$, denoted $\|\vec{p}\|$, is equal to $\sqrt{p_1^2 + p_2^2 + p_3^2}$ by Definition 4.10.

Now suppose we have a vector $\vec{u} = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}^T$ and we multiply $\vec{u}$ by a scalar $k$. By Definition 4.5, $k\vec{u} = \begin{bmatrix} ku_1 & ku_2 & ku_3 \end{bmatrix}^T$. Then, by using Definition 4.10, the length of this vector is given by

$$\sqrt{(ku_1)^2 + (ku_2)^2 + (ku_3)^2} = |k|\sqrt{u_1^2 + u_2^2 + u_3^2}$$

Thus the following holds.

$$\|k\vec{u}\| = |k|\,\|\vec{u}\|$$

In other words, multiplication by a scalar magnifies or shrinks the length of the vector by a factor of $|k|$. If $|k| > 1$, the length of the resulting vector will be magnified. If $|k| < 1$, the length of the resulting vector will shrink. Remember that by the definition of the absolute value, $|k| \geq 0$.

What about the direction? Draw a picture of $\vec{u}$ and $k\vec{u}$ where $k$ is negative. Notice that this causes the resulting vector to point in the opposite direction, while if $k > 0$ it preserves the direction the vector points. Therefore the direction can either reverse, if $k < 0$, or remain preserved, if $k > 0$.

Consider  the  following  example. 
Example 4.17: Graphing Scalar Multiplication

Consider the vectors $\vec{u}$ and $\vec{v}$ drawn below.

Draw $-\vec{u}$, $2\vec{v}$, and $-\frac{1}{2}\vec{v}$.

Solution.

In order to find $-\vec{u}$, we preserve the length of $\vec{u}$ and simply reverse the direction. For $2\vec{v}$, we double the length of $\vec{v}$, while preserving the direction. Finally, $-\frac{1}{2}\vec{v}$ is found by taking half the length of $\vec{v}$ and reversing the direction. These vectors are shown in the following diagram.


♠

Now that we have studied both vector addition and scalar multiplication, we can combine the two actions. Recall Definition 9.12 of linear combinations of column matrices. We can apply this definition to vectors in $\mathbb{R}^n$. A linear combination of vectors in $\mathbb{R}^n$ is a sum of vectors multiplied by scalars.

In  the  following  example,  we  examine  the  geometric  meaning  of  this  concept. 


Example 4.18: Graphing a Linear Combination of Vectors

Consider the following picture of the vectors $\vec{u}$ and $\vec{v}$.

Sketch a picture of $\vec{u} + 2\vec{v}$ and $\vec{u} - \frac{1}{2}\vec{v}$.

Solution. The two vectors are shown below.
4.6 Parametric Lines


We can use the concept of vectors and points to find equations for arbitrary lines in $\mathbb{R}^n$, although in this section the focus will be on lines in $\mathbb{R}^3$.

To begin, consider the case $n = 1$ so we have $\mathbb{R}^1 = \mathbb{R}$. There is only one line here which is the familiar number line, that is $\mathbb{R}$ itself. Therefore it is not necessary to explore the case of $n = 1$ further.

Now consider the case where $n = 2$, in other words $\mathbb{R}^2$. Let $P$ and $P_0$ be two different points in $\mathbb{R}^2$ which are contained in a line $L$. Let $\vec{p}$ and $\vec{p}_0$ be the position vectors for the points $P$ and $P_0$ respectively. Suppose that $Q$ is an arbitrary point on $L$. Consider the following diagram.


Our goal is to be able to define $Q$ in terms of $P$ and $P_0$. Consider the vector $\overrightarrow{P_0P} = \vec{p} - \vec{p}_0$ which has its tail at $P_0$ and point at $P$. If we add $\vec{p} - \vec{p}_0$ to the position vector $\vec{p}_0$ for $P_0$, the sum would be a vector with its point at $P$. In other words,

$$\vec{p} = \vec{p}_0 + (\vec{p} - \vec{p}_0)$$

Now suppose we were to add $t(\vec{p} - \vec{p}_0)$ to $\vec{p}_0$, where $t$ is some scalar. You can see that by doing so, we could find a vector with its point at $Q$. In other words, we can find $t$ such that

$$\vec{q} = \vec{p}_0 + t(\vec{p} - \vec{p}_0)$$

This equation determines the line $L$ in $\mathbb{R}^2$. In fact, it determines a line $L$ in $\mathbb{R}^n$. Consider the following definition.


Definition  4.19:  Vector  Equation  of  a  Line 


Suppose a line $L$ in $\mathbb{R}^n$ contains the two different points $P$ and $P_0$. Let $\vec{p}$ and $\vec{p}_0$ be the position vectors of these two points, respectively. Then, $L$ is the collection of points $Q$ which have the position vector $\vec{q}$ given by

$$\vec{q} = \vec{p}_0 + t(\vec{p} - \vec{p}_0)$$

where $t \in \mathbb{R}$.

Let $\vec{d} = \vec{p} - \vec{p}_0$. Then $\vec{d}$ is the direction vector for $L$ and the vector equation for $L$ is given by

$$\vec{p} = \vec{p}_0 + t\vec{d}, \quad t \in \mathbb{R}$$


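Note that this definition agrees with the usual notion of a line in two dimensions and so this is consistent with earlier concepts. Consider now points in $\mathbb{R}^3$. If a point $P \in \mathbb{R}^3$ is given by $P = (x, y, z)$, and $P_0 \in \mathbb{R}^3$ by $P_0 = (x_0, y_0, z_0)$, then we can write

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x_0 \\ y_0 \\ z_0 \end{bmatrix} + t\begin{bmatrix} a \\ b \\ c \end{bmatrix}$$

where $\vec{d} = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$. This is the vector equation of $L$ written in component form.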


The  following  theorem  claims  that  such  an  equation  is  in  fact  a  line. 


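Proof. Let $\vec{x}_1, \vec{x}_2 \in \mathbb{R}^n$. Define $\vec{x}_1 = \vec{a}$ and let $\vec{x}_2 - \vec{x}_1 = \vec{b}$. Since $\vec{b} \neq \vec{0}$, it follows that $\vec{x}_2 \neq \vec{x}_1$. Then $\vec{a} + t\vec{b} = \vec{x}_1 + t(\vec{x}_2 - \vec{x}_1)$. It follows that $\vec{x} = \vec{a} + t\vec{b}$ is a line containing the two different points $X_1$ and $X_2$ whose position vectors are given by $\vec{x}_1$ and $\vec{x}_2$ respectively. ♠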

We  can  use  the  above  discussion  to  find  the  equation  of  a  line  when  given  two  distinct  points.  Consider 
the  following  example. 


Example  4.21:  A  Line  From  Two  Points 


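Find a vector equation for the line through the points $P_0 = (1,2,0)$ and $P = (2,-4,6)$.

Solution. We will use the definition of a line given above in Definition 4.19 to write this line in the form

$$\vec{q} = \vec{p}_0 + t(\vec{p} - \vec{p}_0)$$

Let $\vec{q} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}$. Then, we can find $\vec{p}$ and $\vec{p}_0$ by taking the position vectors of points $P$ and $P_0$ respectively. Then,

$$\vec{q} = \vec{p}_0 + t(\vec{p} - \vec{p}_0)$$

can be written as

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} + t\begin{bmatrix} 1 \\ -6 \\ 6 \end{bmatrix}, \quad t \in \mathbb{R}$$

Here, the direction vector $\begin{bmatrix} 1 \\ -6 \\ 6 \end{bmatrix}$ is obtained by $\vec{p} - \vec{p}_0 = \begin{bmatrix} 2 \\ -4 \\ 6 \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}$, as indicated above in Definition 4.19.

♠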
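
The construction of a line from two points can be sketched in a few lines of code. The helper name `line_through` is hypothetical; it returns the direction vector together with a function that evaluates the point at a given parameter value $t$.

```python
def line_through(p0, p):
    """Return (d, point_at) for the line through the points p0 and p.

    d is the direction vector p - p0, and point_at(t) evaluates p0 + t*d.
    """
    d = [b - a for a, b in zip(p0, p)]
    if all(dk == 0 for dk in d):
        raise ValueError("the two points must be different")

    def point_at(t):
        return [a + t * dk for a, dk in zip(p0, d)]

    return d, point_at


d, q = line_through([1, 2, 0], [2, -4, 6])
print(d)     # direction vector of the line
print(q(0))  # t = 0 gives P0
print(q(1))  # t = 1 gives P
```

Other values of $t$ give the remaining points of the line; for instance $t = 1/2$ gives the midpoint of the segment from $P_0$ to $P$.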


Notice  that  in  the  above  example  we  said  that  we  found  “a”  vector  equation  for  the  line,  not  “the” 
equation.  The  reason  for  this  terminology  is  that  there  are  infinitely  many  different  vector  equations  for 
the  same  line.  To  see  this,  replace  t  with  another  parameter,  say  3s.  Then  you  obtain  a  different  vector 
equation  for  the  same  line  because  the  same  set  of  points  is  obtained. 


In Example 4.21, the vector given by $\begin{bmatrix} 1 \\ -6 \\ 6 \end{bmatrix}$ is the direction vector defined in Definition 4.19. If we know the direction vector of a line, as well as a point on the line, we can find the vector equation.

Consider the following example.


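Solution. We will use Definition 4.19 to write this line in the form $\vec{p} = \vec{p}_0 + t\vec{d}$, $t \in \mathbb{R}$. We are given the direction vector $\vec{d}$. In order to find $\vec{p}_0$, we can use the position vector of the point $P_0$. This is given by $\begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}$. Letting $\vec{p} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}$, the equation for the line is given by

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} + t\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \quad t \in \mathbb{R} \tag{4.6}$$

♠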
We sometimes elect to write a line such as the one given in 4.6 in the form

$$\left.\begin{aligned} x &= 1 + t \\ y &= 2 + 2t \\ z &= t \end{aligned}\right\} \quad \text{where } t \in \mathbb{R} \tag{4.7}$$

This set of equations gives the same information as 4.6, and is called the parametric equation of the line. Consider the following definition.


You  can  verify  that  the  form  discussed  following  Example  4.22  in  equation  4.7  is  of  the  form  given  in 
Definition  4.23. 

There is one other form for a line which is useful, which is the symmetric form. Consider the line given by 4.7. You can solve for the parameter $t$ to write

$$t = x - 1, \qquad t = \frac{y - 2}{2}, \qquad t = z$$

Therefore,

$$x - 1 = \frac{y - 2}{2} = z$$

This is the symmetric form of the line.

In  the  following  example,  we  look  at  how  to  take  the  equation  of  a  line  from  symmetric  form  to 
parametric  form. 
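Solution. We want to write this line in the form given by Definition 4.23. This is of the form

$$\left.\begin{aligned} x &= x_0 + ta \\ y &= y_0 + tb \\ z &= z_0 + tc \end{aligned}\right\} \quad \text{where } t \in \mathbb{R}$$

Let $t = \frac{x - 2}{3}$, $t = \frac{y - 1}{2}$, and $t = z + 3$, as given in the symmetric form of the line. Then solving for $x, y, z$ yields

$$\left.\begin{aligned} x &= 2 + 3t \\ y &= 1 + 2t \\ z &= -3 + t \end{aligned}\right\} \quad \text{with } t \in \mathbb{R}$$

This is the parametric equation for this line.

Now, we want to write this line in the form given by Definition 4.19. This is the form

$$\vec{p} = \vec{p}_0 + t\vec{d}$$

where $t \in \mathbb{R}$. This equation becomes

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ -3 \end{bmatrix} + t\begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix}, \quad t \in \mathbb{R}$$

♠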


Exercises 


Exercise 4.6.1 Find the vector equation for the line through $(-7,6,0)$ and $(-1,1,4)$. Then, find the parametric equations for this line.

Exercise 4.6.2 Find parametric equations for the line through the point $(7,7,1)$ with a direction vector $\vec{d} = \begin{bmatrix} 1 \\ 6 \\ 2 \end{bmatrix}$.

Exercise 4.6.3 Parametric equations of the line are

$$\begin{aligned} x &= t + 2 \\ y &= 6 - 3t \\ z &= -t - 6 \end{aligned}$$

Find a direction vector for the line and a point on the line.

Exercise 4.6.4 Find the vector equation for the line through the two points $(-5,5,1)$, $(2,2,4)$. Then, find the parametric equations.

Exercise 4.6.5 The equation of a line in two dimensions is written as $y = x - 5$. Find parametric equations for this line.

Exercise 4.6.6 Find parametric equations for the line through $(6,5,-2)$ and $(5,1,2)$.

Exercise 4.6.7 Find the vector equation and parametric equations for the line through the point $(-7,10,-6)$ with a direction vector $\vec{d} = \begin{bmatrix} 1 \\ 1 \\ 3 \end{bmatrix}$.

Exercise 4.6.8 Parametric equations of the line are

$$\begin{aligned} x &= 2t + 2 \\ y &= 5 - 4t \\ z &= -t - 3 \end{aligned}$$

Find a direction vector for the line and a point on the line, and write the vector equation of the line.

Exercise 4.6.9 Find the vector equation and parametric equations for the line through the two points $(4,10,0)$, $(1,-5,-6)$.

Exercise  4.6.10  Find  the  point  on  the  line  segment  from  P  =  (—4,7,5)  to  Q—  (2, —2, —3)  which  is  A  of 
the  way  from  P  to  Q. 

Exercise 4.6.11 Suppose a triangle in $\mathbb{R}^n$ has vertices at $P_1$, $P_2$, and $P_3$. Consider the lines which are drawn from a vertex to the midpoint of the opposite side. Show these three lines intersect in a point and find the coordinates of this point.


4.7  The  Dot  Product 


Outcomes 


A.  Compute  the  dot  product  of  vectors,  and  use  this  to  compute  vector  projections. 
4.7.1. The Dot Product


There  are  two  ways  of  multiplying  vectors  which  are  of  great  importance  in  applications.  The  first  of  these 
is  called  the  dot  product.  When  we  take  the  dot  product  of  vectors,  the  result  is  a  scalar.  For  this  reason, 
the  dot  product  is  also  called  the  scalar  product  and  sometimes  the  inner  product.  The  definition  is  as 
follows. 


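The dot product $\vec{u} \bullet \vec{v}$ is sometimes denoted as $(\vec{u}, \vec{v})$ where a comma replaces $\bullet$. It can also be written as $\langle \vec{u}, \vec{v} \rangle$. If we write the vectors as column matrices, it is equal to the matrix product $\vec{u}^T\vec{v}$; for row matrices, it is equal to $\vec{u}\vec{v}^T$.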

Consider  the  following  example. 


Solution. By Definition 4.25, we must compute

$$\vec{u} \bullet \vec{v} = \sum_{k=1}^{4} u_k v_k$$

This is given by

$$\begin{aligned} \vec{u} \bullet \vec{v} &= (1)(0) + (2)(1) + (0)(2) + (-1)(3) \\ &= 0 + 2 + 0 - 3 \\ &= -1 \end{aligned}$$

♠
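
The sum in the definition of the dot product can be computed directly. The sketch below is a minimal illustration; the function name `dot` is an assumption, and the two vectors are the ones whose entries appear in the computation above.

```python
def dot(u, v):
    """Dot product of two vectors with the same number of entries."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of entries")
    # Multiply corresponding entries and add the products.
    return sum(uk * vk for uk, vk in zip(u, v))


u = [1, 2, 0, -1]
v = [0, 1, 2, 3]
print(dot(u, v))  # -1
```

Notice that the result is a single number, not a vector, which is why the dot product is also called the scalar product.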


With  this  definition,  there  are  several  important  properties  satisfied  by  the  dot  product. 
The proof is left as an exercise. This proposition tells us that we can also use the dot product to find the length of a vector.


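Solution. By Proposition 4.27, $\|\vec{u}\|^2 = \vec{u} \bullet \vec{u}$. Therefore, $\|\vec{u}\| = \sqrt{\vec{u} \bullet \vec{u}}$. First, compute $\vec{u} \bullet \vec{u}$. This is given by

$$\begin{aligned} \vec{u} \bullet \vec{u} &= (2)(2) + (1)(1) + (4)(4) + (2)(2) \\ &= 4 + 1 + 16 + 4 \\ &= 25 \end{aligned}$$

Then,

$$\|\vec{u}\| = \sqrt{\vec{u} \bullet \vec{u}} = \sqrt{25} = 5$$

♠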


You  may  wish  to  compare  this  to  our  previous  definition  of  length,  given  in  Definition  4.14. 

The  Cauchy  Schwarz  inequality  is  a  fundamental  inequality  satisfied  by  the  dot  product.  It  is  given 
in  the  following  theorem. 
Proof. First note that if $\vec{v} = \vec{0}$ both sides of 4.8 equal zero and so the inequality holds in this case. Therefore, it will be assumed in what follows that $\vec{v} \neq \vec{0}$.

Define a function of $t \in \mathbb{R}$ by

$$f(t) = (\vec{u} + t\vec{v}) \bullet (\vec{u} + t\vec{v})$$

Then by Proposition 4.27, $f(t) \geq 0$ for all $t \in \mathbb{R}$. Also from Proposition 4.27

$$\begin{aligned} f(t) &= \vec{u} \bullet (\vec{u} + t\vec{v}) + t\vec{v} \bullet (\vec{u} + t\vec{v}) \\ &= \vec{u} \bullet \vec{u} + t(\vec{u} \bullet \vec{v}) + t\vec{v} \bullet \vec{u} + t^2\vec{v} \bullet \vec{v} \\ &= \|\vec{u}\|^2 + 2t(\vec{u} \bullet \vec{v}) + \|\vec{v}\|^2t^2 \end{aligned}$$

Now this means the graph of $y = f(t)$ is a parabola which opens up and either its vertex touches the $t$ axis or else the entire graph is above the $t$ axis. In the first case, there exists some $t$ where $f(t) = 0$ and this requires $\vec{u} + t\vec{v} = \vec{0}$, so one vector is a multiple of the other. Then clearly equality holds in 4.8. In the case where $\vec{u}$ is not a multiple of $\vec{v}$, it follows $f(t) > 0$ for all $t$, which says $f(t)$ has no real zeros and so from the quadratic formula,

$$(2(\vec{u} \bullet \vec{v}))^2 - 4\|\vec{u}\|^2\|\vec{v}\|^2 < 0$$

which is equivalent to $|\vec{u} \bullet \vec{v}| < \|\vec{u}\|\,\|\vec{v}\|$. ♠

Notice  that  this  proof  was  based  only  on  the  properties  of  the  dot  product  listed  in  Proposition  4.27. 
This  means  that  whenever  an  operation  satisfies  these  properties,  the  Cauchy  Schwarz  inequality  holds. 
There are many other instances of these properties besides vectors in $\mathbb{R}^n$.
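
A numerical experiment cannot replace the proof, but it can make the Cauchy Schwarz inequality $|\vec{u} \bullet \vec{v}| \leq \|\vec{u}\|\,\|\vec{v}\|$ concrete. The following sketch is only an illustration: the helper names, the random seed, the sample size and the small tolerance for floating point rounding are all assumptions made for this example.

```python
import math
import random


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(uk * vk for uk, vk in zip(u, v))


def cauchy_schwarz_gap(u, v):
    """Return ||u|| ||v|| - |u . v|, which should never be negative."""
    return math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)) - abs(dot(u, v))


random.seed(0)  # fixed seed so the experiment is repeatable
gaps = []
for _ in range(1000):
    u = [random.uniform(-5, 5) for _ in range(4)]
    v = [random.uniform(-5, 5) for _ in range(4)]
    gaps.append(cauchy_schwarz_gap(u, v))

# Every gap is nonnegative, up to rounding error.
print(min(gaps) >= -1e-9)

# Equality (gap zero, up to rounding) when one vector is a multiple of the other.
print(abs(cauchy_schwarz_gap([1, 2, 3], [2, 4, 6])) < 1e-9)
```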

The Cauchy Schwarz inequality provides another proof of the triangle inequality for distances in $\mathbb{R}^n$.


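Proof. By properties of the dot product and the Cauchy Schwarz inequality,

$$\begin{aligned} \|\vec{u} + \vec{v}\|^2 &= (\vec{u} + \vec{v}) \bullet (\vec{u} + \vec{v}) \\ &= \|\vec{u}\|^2 + 2(\vec{u} \bullet \vec{v}) + \|\vec{v}\|^2 \\ &\leq \|\vec{u}\|^2 + 2|\vec{u} \bullet \vec{v}| + \|\vec{v}\|^2 \\ &\leq \|\vec{u}\|^2 + 2\|\vec{u}\|\,\|\vec{v}\| + \|\vec{v}\|^2 = (\|\vec{u}\| + \|\vec{v}\|)^2 \end{aligned}$$

Hence,

$$\|\vec{u} + \vec{v}\|^2 \leq (\|\vec{u}\| + \|\vec{v}\|)^2$$

Taking square roots of both sides you obtain 4.9.

It remains to consider when equality occurs. Suppose $\vec{u} = \vec{0}$. Then, $\vec{u} = 0\vec{v}$ and the claim about when equality occurs is verified. The same argument holds if $\vec{v} = \vec{0}$. Therefore, it can be assumed both vectors are nonzero. To get equality in 4.9 above, Theorem 4.29 implies one of the vectors must be a multiple of the other. Say $\vec{v} = k\vec{u}$. If $k < 0$ then equality cannot occur in 4.9 because in this case

$$\vec{u} \bullet \vec{v} = k\|\vec{u}\|^2 < 0 < |k|\,\|\vec{u}\|^2 = |\vec{u} \bullet \vec{v}|$$

Therefore, $k \geq 0$.

To get the other form of the triangle inequality write

$$\vec{u} = \vec{u} - \vec{v} + \vec{v}$$

so

$$\|\vec{u}\| = \|\vec{u} - \vec{v} + \vec{v}\| \leq \|\vec{u} - \vec{v}\| + \|\vec{v}\|$$

Therefore,

$$\|\vec{u}\| - \|\vec{v}\| \leq \|\vec{u} - \vec{v}\| \tag{4.11}$$

Similarly,

$$\|\vec{v}\| - \|\vec{u}\| \leq \|\vec{v} - \vec{u}\| = \|\vec{u} - \vec{v}\| \tag{4.12}$$

It follows from 4.11 and 4.12 that 4.10 holds. This is because $\left|\,\|\vec{u}\| - \|\vec{v}\|\,\right|$ equals the left side of either 4.11 or 4.12 and either way, $\left|\,\|\vec{u}\| - \|\vec{v}\|\,\right| \leq \|\vec{u} - \vec{v}\|$. ♠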

4.7.2.  The  Geometric  Significance  of  the  Dot  Product 


Given two vectors, $\vec{u}$ and $\vec{v}$, the included angle is the angle between these two vectors which is given by $\theta$ such that $0 \leq \theta \leq \pi$. The dot product can be used to determine the included angle between two vectors. Consider the following picture where $\theta$ gives the included angle.
In words, the dot product of two vectors equals the product of the magnitudes (or lengths) of the two vectors multiplied by the cosine of the included angle. Note this gives a geometric description of the dot product which does not depend explicitly on the coordinates of the vectors.

Consider  the  following  example. 


Solution. By Proposition 4.31,

$$\vec{u} \bullet \vec{v} = \|\vec{u}\|\,\|\vec{v}\|\cos\theta$$

Hence,

$$\cos\theta = \frac{\vec{u} \bullet \vec{v}}{\|\vec{u}\|\,\|\vec{v}\|}$$

First, we can compute $\vec{u} \bullet \vec{v}$. By Definition 4.25, this equals

$$\vec{u} \bullet \vec{v} = (2)(3) + (1)(4) + (-1)(1) = 9$$

Then,

$$\|\vec{u}\| = \sqrt{(2)(2) + (1)(1) + (-1)(-1)} = \sqrt{6}$$

$$\|\vec{v}\| = \sqrt{(3)(3) + (4)(4) + (1)(1)} = \sqrt{26}$$

Therefore, the cosine of the included angle equals

$$\cos\theta = \frac{9}{\sqrt{26}\sqrt{6}} = 0.7205766\ldots$$

With the cosine known, the angle can be determined by computing the inverse cosine of that value, giving approximately $\theta = 0.76616$ radians. ♠
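
The same computation can be sketched in code. The helper names below are hypothetical, and the clamping step is an added safeguard (not part of the text) against rounding pushing the cosine slightly outside $[-1, 1]$.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(uk * vk for uk, vk in zip(u, v))


def included_angle(u, v):
    """Included angle, in radians between 0 and pi, of two nonzero vectors."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("the included angle needs nonzero vectors")
    cos_theta = dot(u, v) / (norm_u * norm_v)
    # Guard against rounding error before calling acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)


theta = included_angle([2, 1, -1], [3, 4, 1])
print(round(theta, 5))  # 0.76616
```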

Another application of the geometric description of the dot product is in finding the angle between two lines. Typically one would assume that the lines intersect. In some situations, however, it may make sense to ask this question when the lines do not intersect, such as the angle between two object trajectories. In any case we understand it to mean the smallest angle between (any of) their direction vectors. The only subtlety here is that if $\vec{u}$ is a direction vector for a line, then so is any nonzero multiple $k\vec{u}$, and thus we will find supplementary angles among all angles between direction vectors for two lines, and we simply take the smaller of the two.
Solution. You can verify that these lines do not intersect, but as discussed above this does not matter and we simply find the smallest angle between any direction vectors for these lines.

To do so we first find the angle between the direction vectors given above:

$$\vec{u} = \begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix}, \quad \vec{v} = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}$$

In order to find the angle, we solve the following equation for $\theta$

$$\vec{u} \bullet \vec{v} = \|\vec{u}\|\,\|\vec{v}\|\cos\theta$$

to obtain $\cos\theta = -\frac{1}{2}$, and since we choose included angles between $0$ and $\pi$ we obtain $\theta = \frac{2\pi}{3}$.

Now the angles between any two direction vectors for these lines will either be $\frac{2\pi}{3}$ or its supplement $\phi = \pi - \frac{2\pi}{3} = \frac{\pi}{3}$. We choose the smaller angle, and therefore conclude that the angle between the two lines is $\frac{\pi}{3}$. ♠

We  can  also  use  Proposition  4.31  to  compute  the  dot  product  of  two  vectors. 


Solution. From the geometric description of the dot product in Proposition 4.31

$$\vec{u} \bullet \vec{v} = (3)(4)\cos(\pi/3) = 3 \times 4 \times 1/2 = 6$$

♠

Two nonzero vectors are said to be perpendicular, sometimes also called orthogonal, if the included angle is $\pi/2$ radians ($90^\circ$).

Consider  the  following  proposition. 
Proof. This follows directly from Proposition 4.31. First, if the dot product of two nonzero vectors is equal to $0$, this tells us that $\cos\theta = 0$ (this is where we need nonzero vectors). Thus $\theta = \pi/2$ and the vectors are perpendicular.

If on the other hand $\vec{v}$ is perpendicular to $\vec{u}$, then the included angle is $\pi/2$ radians. Hence $\cos\theta = 0$ and $\vec{u} \bullet \vec{v} = 0$. ♠

Consider  the  following  example. 


Solution. In order to determine if these two vectors are perpendicular, we compute the dot product. This is given by

$$\vec{u} \bullet \vec{v} = (2)(1) + (1)(3) + (-1)(5) = 0$$

Therefore, by Proposition 4.35 these two vectors are perpendicular. ♠

4.7.3.  Projections 


In  some  applications,  we  wish  to  write  a  vector  as  a  sum  of  two  related  vectors.  Through  the  concept 
of  projections,  we  can  find  these  two  vectors.  First,  we  explore  an  important  theorem.  The  result  of  this 
theorem  will  provide  our  definition  of  a  vector  projection. 


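Proof. Suppose 4.13 holds and $\vec{v}_{\|} = k\vec{u}$. Taking the dot product of both sides of 4.13 with $\vec{u}$ and using $\vec{v}_{\perp} \bullet \vec{u} = 0$, this yields

$$\begin{aligned} \vec{v} \bullet \vec{u} &= (\vec{v}_{\|} + \vec{v}_{\perp}) \bullet \vec{u} \\ &= k\vec{u} \bullet \vec{u} + \vec{v}_{\perp} \bullet \vec{u} \\ &= k\|\vec{u}\|^2 \end{aligned}$$

which requires $k = \vec{v} \bullet \vec{u}/\|\vec{u}\|^2$. Thus there can be no more than one vector $\vec{v}_{\|}$. It follows $\vec{v}_{\perp}$ must equal $\vec{v} - \vec{v}_{\|}$. This verifies there can be no more than one choice for both $\vec{v}_{\|}$ and $\vec{v}_{\perp}$ and proves their uniqueness.

Now let

$$\vec{v}_{\|} = \frac{\vec{v} \bullet \vec{u}}{\|\vec{u}\|^2}\vec{u}$$

and let

$$\vec{v}_{\perp} = \vec{v} - \vec{v}_{\|} = \vec{v} - \frac{\vec{v} \bullet \vec{u}}{\|\vec{u}\|^2}\vec{u}$$

Then $\vec{v}_{\|} = k\vec{u}$ where $k = \frac{\vec{v} \bullet \vec{u}}{\|\vec{u}\|^2}$. It only remains to verify $\vec{v}_{\perp} \bullet \vec{u} = 0$. But

$$\begin{aligned} \vec{v}_{\perp} \bullet \vec{u} &= \vec{v} \bullet \vec{u} - \frac{\vec{v} \bullet \vec{u}}{\|\vec{u}\|^2}\vec{u} \bullet \vec{u} \\ &= \vec{v} \bullet \vec{u} - \vec{v} \bullet \vec{u} \\ &= 0 \end{aligned}$$

♠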


The vector $\vec{v}_{\|}$ in Theorem 4.37 is called the projection of $\vec{v}$ onto $\vec{u}$ and is denoted by

$$\vec{v}_{\|} = \mathrm{proj}_{\vec{u}}(\vec{v})$$

We now make a formal definition of the vector projection.


Consider  the  following  example  of  a  projection. 
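Solution. We can use the formula provided in Definition 4.38 to find $\mathrm{proj}_{\vec{u}}(\vec{v})$. First, compute $\vec{v} \bullet \vec{u}$. This is given by

$$\begin{aligned} \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} \bullet \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix} &= (2)(1) + (3)(-2) + (-4)(1) \\ &= 2 - 6 - 4 \\ &= -8 \end{aligned}$$

Similarly, $\vec{u} \bullet \vec{u}$ is given by

$$\begin{aligned} \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix} \bullet \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix} &= (2)(2) + (3)(3) + (-4)(-4) \\ &= 4 + 9 + 16 \\ &= 29 \end{aligned}$$

Therefore, the projection is equal to

$$\mathrm{proj}_{\vec{u}}(\vec{v}) = -\frac{8}{29}\begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix} = \begin{bmatrix} -\frac{16}{29} \\ -\frac{24}{29} \\ \frac{32}{29} \end{bmatrix}$$

♠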
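
The projection formula $\frac{\vec{v} \bullet \vec{u}}{\|\vec{u}\|^2}\vec{u}$ can be evaluated exactly when the entries are integers. The sketch below makes that assumption (integer entries) so that Python's `fractions.Fraction` can be used; the helper names `dot` and `project` are hypothetical.

```python
from fractions import Fraction


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(uk * vk for uk, vk in zip(u, v))


def project(v, u):
    """Projection of v onto the nonzero vector u (integer entries assumed)."""
    uu = dot(u, u)
    if uu == 0:
        raise ValueError("cannot project onto the zero vector")
    k = Fraction(dot(v, u), uu)  # the scalar (v . u) / ||u||^2
    return [k * uk for uk in u]


v = [1, -2, 1]
u = [2, 3, -4]
v_par = project(v, u)
v_perp = [vk - pk for vk, pk in zip(v, v_par)]

print(v_par)           # entries -16/29, -24/29, 32/29
print(dot(v_perp, u))  # 0, so the remainder is perpendicular to u
```

The last line checks the defining property from Theorem 4.37: what is left of $\vec{v}$ after removing the projection is perpendicular to $\vec{u}$.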

We  will  conclude  this  section  with  an  important  application  of  projections.  Suppose  a  line  L  and  a 
point  P  are  given  such  that  P  is  not  contained  in  L.  Through  the  use  of  projections,  we  can  determine  the 
shortest  distance  from  P  to  L. 




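Example 4.40: Shortest Distance from a Point to a Line

Let $P = (1,3,5)$ be a point in $\mathbb{R}^3$, and let $L$ be the line which goes through point $P_0 = (0,4,-2)$ with direction vector $\vec{d} = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}$. Find the shortest distance from $P$ to the line $L$, and find the point $Q$ on $L$ that is closest to $P$.

Solution. In order to determine the shortest distance from $P$ to $L$, we will first find the vector $\overrightarrow{P_0P}$ and then find the projection of this vector onto $L$. The vector $\overrightarrow{P_0P}$ is given by

$$\begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} - \begin{bmatrix} 0 \\ 4 \\ -2 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 7 \end{bmatrix}$$

Then, if $Q$ is the point on $L$ closest to $P$, it follows that

$$\overrightarrow{P_0Q} = \mathrm{proj}_{\vec{d}}\,\overrightarrow{P_0P} = \left(\frac{\overrightarrow{P_0P} \bullet \vec{d}}{\|\vec{d}\|^2}\right)\vec{d} = \frac{15}{9}\begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} = \frac{5}{3}\begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}$$

Now, the distance from $P$ to $L$ is given by

$$\|\overrightarrow{QP}\| = \|\overrightarrow{P_0P} - \overrightarrow{P_0Q}\| = \sqrt{26}$$

The point $Q$ is found by adding the vector $\overrightarrow{P_0Q}$ to the position vector $\overrightarrow{0P_0}$ for $P_0$ as follows

$$\begin{bmatrix} 0 \\ 4 \\ -2 \end{bmatrix} + \frac{5}{3}\begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} \frac{10}{3} \\ \frac{17}{3} \\ \frac{4}{3} \end{bmatrix}$$

Therefore, $Q = \left(\frac{10}{3}, \frac{17}{3}, \frac{4}{3}\right)$.

♠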
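
The steps of this example (form $\overrightarrow{P_0P}$, project onto $\vec{d}$, add the result to $\vec{p}_0$) can be sketched as follows. The function name `closest_point_on_line` is hypothetical, and the sketch assumes integer coordinates so that exact fractions can be used.

```python
import math
from fractions import Fraction


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(uk * vk for uk, vk in zip(u, v))


def closest_point_on_line(p, p0, d):
    """Return (q, dist_sq): the point q on the line p0 + t*d closest to p,
    and the squared distance from p to q. Integer inputs are assumed."""
    if dot(d, d) == 0:
        raise ValueError("the direction vector must be nonzero")
    w = [a - b for a, b in zip(p, p0)]    # the vector from P0 to P
    k = Fraction(dot(w, d), dot(d, d))    # projection coefficient
    q = [b + k * dk for b, dk in zip(p0, d)]
    dist_sq = sum((a - b) ** 2 for a, b in zip(p, q))
    return q, dist_sq


q, dist_sq = closest_point_on_line([1, 3, 5], [0, 4, -2], [2, 1, 2])
print(q)                   # entries 10/3, 17/3, 4/3
print(dist_sq)             # 26
print(math.sqrt(dist_sq))  # the shortest distance, sqrt(26)
```

Working with the squared distance keeps the arithmetic exact until the final square root.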


Exercises 


Exercise  4.7.1  Find 


'  1  ' 

"  2  ' 

2 

0 

3 

• 

1 

4 

3 

Exercise  4.7.2  Use  the  formula  given  in  Proposition  4.31  to  verify  the  Cauchy  Schwarz  inequality  and  to 
show  that  equality  occurs  if  and  only  if  one  of  the  vectors  is  a  scalar  multiple  of  the  other. 


Exercise 4.7.3 For u, v vectors in R^3, define the product u * v = u_1 v_1 + 2 u_2 v_2 + 3 u_3 v_3. Show the axioms
for a dot product all hold for this product. Prove

$$\|u * v\| \le (u * u)^{1/2} (v * v)^{1/2}$$

Exercise 4.7.4 Let a, b be vectors. Show that $a \cdot b = \frac{1}{4}\left( \|a + b\|^2 - \|a - b\|^2 \right)$.

Exercise 4.7.5 Using the axioms of the dot product, prove the parallelogram identity:

$$\|a + b\|^2 + \|a - b\|^2 = 2\|a\|^2 + 2\|b\|^2$$

Exercise 4.7.6 Let A be a real m × n matrix and let u ∈ R^n and v ∈ R^m. Show Au • v = u • A^T v. Hint: Use
the definition of matrix multiplication to do this.

Exercise 4.7.7 Use the result of Problem 4.7.6 to verify directly that (AB)^T = B^T A^T without making any
reference to subscripts.


Exercise 4.7.8 Find the angle between the vectors

$$u = \begin{bmatrix} 3 \\ -1 \\ -1 \end{bmatrix}, \quad v = \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix}$$

Exercise 4.7.9 Find the angle between the vectors

$$u = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}, \quad v = \begin{bmatrix} 1 \\ 2 \\ -7 \end{bmatrix}$$

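Exercise 4.7.10 Find proj_v(w) where $w = \begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix}$ and $v = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$.

Exercise 4.7.11 Find proj_v(w) where $w = \begin{bmatrix} 1 \\ 2 \\ -2 \end{bmatrix}$ and $v = \begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$.

Exercise 4.7.12 Find proj_v(w) where $w = \begin{bmatrix} 1 \\ 2 \\ -2 \\ 1 \end{bmatrix}$ and $v = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 0 \end{bmatrix}$.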

Exercise 4.7.13 Let P = (1,2,3) be a point in R^3. Let L be the line through the point P_0 = (1,4,5) with
direction vector d = [ 1 -1 1 ]^T. Find the shortest distance from P to L, and find the point Q on L that is
closest to P.

Exercise 4.7.14 Let P = (0,2,1) be a point in R^3. Let L be the line through the point P_0 = (1,1,1) with
direction vector d = [ 3 0 1 ]^T. Find the shortest distance from P to L, and find the point Q on L that is closest
to P.


Exercise 4.7.15 Does it make sense to speak of proj_0(w)?

Exercise 4.7.16 Prove the Cauchy Schwarz inequality in R^n as follows. For v, w vectors, consider

$$(w - \mathrm{proj}_v w) \cdot (w - \mathrm{proj}_v w) \ge 0$$

Simplify using the axioms of the dot product and then put in the formula for the projection. Notice that
this expression equals 0 and you get equality in the Cauchy Schwarz inequality if and only if w = proj_v w.
What is the geometric meaning of w = proj_v w?

Exercise 4.7.17 Let v, w, u be vectors. Show that (w + u)_⊥ = w_⊥ + u_⊥ where w_⊥ = w - proj_v(w).

Exercise 4.7.18 Show that

$$(v - \mathrm{proj}_u(v), u) = (v - \mathrm{proj}_u(v)) \cdot u = 0$$

and conclude every vector in R^n can be written as the sum of two vectors, one which is perpendicular and
one which is parallel to the given vector.
4.8 Planes in R^n


Much like the above discussion with lines, vectors can be used to determine planes in R^n. Given a
vector n in R^n and a point P_0, it is possible to find a unique plane which contains P_0 and is perpendicular
to the given vector.


In  other  words,  we  say  that  n  is  orthogonal  (perpendicular)  to  every  vector  in  the  plane. 

Consider now a plane with normal vector given by n, and containing a point P_0. Notice that this plane
is unique. If P is an arbitrary point on this plane, then by definition the normal vector is orthogonal to the
vector between P_0 and P. Letting $\overrightarrow{OP}$ and $\overrightarrow{OP_0}$ be the position vectors of points P and P_0 respectively, it
follows that

$$n \cdot (\overrightarrow{OP} - \overrightarrow{OP_0}) = 0$$

or

$$n \cdot \overrightarrow{P_0P} = 0$$

The  first  of  these  equations  gives  the  vector  equation  of  the  plane. 


Definition  4.42:  Vector  Equation  of  a  Plane 


Let n be the normal vector for a plane which contains a point P_0. If P is an arbitrary point on this
plane, then the vector equation of the plane is given by

$$n \cdot (\overrightarrow{OP} - \overrightarrow{OP_0}) = 0$$


Notice  that  this  equation  can  be  used  to  determine  if  a  point  P  is  contained  in  a  certain  plane. 


Example  4.43:  A  Point  in  a  Plane 


Let n = [ 1 2 3 ]^T be the normal vector for a plane which contains the point P_0 = (2,1,4). Determine
if the point P = (5,4,1) is contained in this plane.
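Solution. By Definition 4.42, P is a point in the plane if it satisfies the equation

$$n \cdot (\overrightarrow{OP} - \overrightarrow{OP_0}) = 0$$

Given the above n, P_0, and P, this equation becomes

$$\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \cdot \left( \begin{bmatrix} 5 \\ 4 \\ 1 \end{bmatrix} - \begin{bmatrix} 2 \\ 1 \\ 4 \end{bmatrix} \right) = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \cdot \begin{bmatrix} 3 \\ 3 \\ -3 \end{bmatrix} = 3 + 6 - 9 = 0$$

Therefore P = (5,4,1) is contained in the plane.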


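Suppose n = [ a b c ]^T, P = (x, y, z) and P_0 = (x_0, y_0, z_0). Then

$$\begin{aligned}
n \cdot (\overrightarrow{OP} - \overrightarrow{OP_0}) &= 0 \\
\begin{bmatrix} a \\ b \\ c \end{bmatrix} \cdot \left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} - \begin{bmatrix} x_0 \\ y_0 \\ z_0 \end{bmatrix} \right) &= 0 \\
\begin{bmatrix} a \\ b \\ c \end{bmatrix} \cdot \begin{bmatrix} x - x_0 \\ y - y_0 \\ z - z_0 \end{bmatrix} &= 0 \\
a(x - x_0) + b(y - y_0) + c(z - z_0) &= 0
\end{aligned}$$

We can also write this equation as

$$ax + by + cz = ax_0 + by_0 + cz_0$$

Notice that since P_0 is given, ax_0 + by_0 + cz_0 is a known scalar, which we can call d. This equation
becomes

$$ax + by + cz = d$$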
Consider the following example.


Example  4.45:  Finding  the  Equation  of  a  Plane 

Find an equation of the plane containing P_0 = (3,-2,5) and orthogonal to n = [ -2 4 1 ]^T.

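Solution. The above vector n is the normal vector for this plane. Using Definition 4.42, we can determine
the vector equation for this plane.

$$\begin{aligned}
n \cdot (\overrightarrow{OP} - \overrightarrow{OP_0}) &= 0 \\
\begin{bmatrix} -2 \\ 4 \\ 1 \end{bmatrix} \cdot \left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} - \begin{bmatrix} 3 \\ -2 \\ 5 \end{bmatrix} \right) &= 0 \\
\begin{bmatrix} -2 \\ 4 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} x - 3 \\ y + 2 \\ z - 5 \end{bmatrix} &= 0
\end{aligned}$$

Using Definition 4.44, we can determine the scalar equation of the plane.

$$-2x + 4y + 1z = -2(3) + 4(-2) + 1(5) = -9$$

Hence, the vector equation of the plane is

$$\begin{bmatrix} -2 \\ 4 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} x - 3 \\ y + 2 \\ z - 5 \end{bmatrix} = 0$$

and the scalar equation is

$$-2x + 4y + 1z = -9$$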


* 
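The two plane computations above (testing whether a point lies in a plane, and passing from the vector equation to the scalar equation) can be sketched as follows. The function names `scalar_equation` and `on_plane` are hypothetical, chosen for illustration only.

```python
def dot(a, b):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def scalar_equation(n, p0):
    """Return (a, b, c, d) such that the plane is ax + by + cz = d, with d = n . OP0."""
    return (*n, dot(n, p0))


def on_plane(n, p0, p):
    """True when n . (OP - OP0) = 0, i.e. P satisfies the vector equation."""
    return dot(n, [x - y for x, y in zip(p, p0)]) == 0


print(scalar_equation([-2, 4, 1], [3, -2, 5]))    # (-2, 4, 1, -9)
print(on_plane([1, 2, 3], [2, 1, 4], [5, 4, 1]))  # True
```

The first call uses the data of Example 4.45 and the second the data of Example 4.43.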


Suppose  a  point  P  is  not  contained  in  a  given  plane.  We  are  then  interested  in  the  shortest  distance 
from  that  point  P  to  the  given  plane.  Consider  the  following  example. 


Example  4.46:  Shortest  Distance  From  a  Point  to  a  Plane 


Find the shortest distance from the point P = (3,2,3) to the plane given by
2x + y + 2z = 2, and find the point Q on the plane that is closest to P.


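Solution. Pick an arbitrary point P_0 on the plane. Then, it follows that

$$\overrightarrow{QP} = \mathrm{proj}_n \overrightarrow{P_0P}$$

and $\|\overrightarrow{QP}\|$ is the shortest distance from P to the plane. Further, the vector $\overrightarrow{OQ} = \overrightarrow{OP} - \overrightarrow{QP}$ gives the necessary
point Q.

From the above scalar equation, we have that n = [ 2 1 2 ]^T. Now, choose P_0 = (1,0,0) so that $n \cdot \overrightarrow{OP_0} = 2 = d$. Then,

$$\overrightarrow{P_0P} = \begin{bmatrix} 3 \\ 2 \\ 3 \end{bmatrix} - \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix}$$

Next, compute $\overrightarrow{QP} = \mathrm{proj}_n \overrightarrow{P_0P}$.

$$\overrightarrow{QP} = \mathrm{proj}_n \overrightarrow{P_0P} = \left( \frac{\overrightarrow{P_0P} \cdot n}{\|n\|^2} \right) n = \frac{12}{9} \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} = \frac{4}{3} \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}$$

Then, $\|\overrightarrow{QP}\| = 4$ so the shortest distance from P to the plane is 4.
Next, to find the point Q on the plane which is closest to P we have

$$\overrightarrow{OQ} = \overrightarrow{OP} - \overrightarrow{QP} = \begin{bmatrix} 3 \\ 2 \\ 3 \end{bmatrix} - \frac{4}{3} \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} = \frac{1}{3} \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$$

Therefore, Q = (1/3, 2/3, 1/3).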


* 
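A short sketch of the point-to-plane computation follows. It assumes the plane is given in scalar form n • x = d, and it uses the observation that for any P_0 on the plane, $\overrightarrow{P_0P} \cdot n = n \cdot \overrightarrow{OP} - d$, so no particular P_0 has to be chosen. The function name `closest_point_on_plane` is hypothetical.

```python
from fractions import Fraction
from math import sqrt


def dot(a, b):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def closest_point_on_plane(p, n, d):
    """Return (Q, distance) for the point p and the plane n . x = d."""
    # Coefficient of the projection of P0P onto n; independent of the choice of P0.
    t = Fraction(dot(n, p) - d, dot(n, n))
    qp = [t * x for x in n]               # vector from Q to P
    q = [a - b for a, b in zip(p, qp)]    # OQ = OP - QP
    return q, sqrt(dot(qp, qp))


q, distance = closest_point_on_plane([3, 2, 3], [2, 1, 2], 2)
print(q, distance)  # distance is 4.0 for the data of Example 4.46
```

A quick sanity check on any result is that the returned point satisfies the scalar equation of the plane.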


4.9  The  Cross  Product 


Recall  that  the  dot  product  is  one  of  two  important  products  for  vectors.  The  second  type  of  product 
for  vectors  is  called  the  cross  product.  It  is  important  to  note  that  the  cross  product  is  only  defined  in 
R^3. First we discuss the geometric meaning and then a description in terms of coordinates is given, both
of  which  are  important.  The  geometric  description  is  essential  in  order  to  understand  the  applications  to 
physics  and  geometry  while  the  coordinate  description  is  necessary  to  compute  the  cross  product. 

Consider  the  following  definition. 


Definition  4.47:  Right  Hand  System  of  Vectors 


Three vectors, u, v, w form a right hand system if when you extend the fingers of your right hand
along the direction of vector u and close them in the direction of v, the thumb points roughly in the
direction of w.


For  an  example  of  a  right  handed  system  of  vectors,  see  the  following  picture. 


[Figure: a right handed system of vectors u, v, w]


In  this  picture  the  vector  w  points  upwards  from  the  plane  determined  by  the  other  two  vectors.  Point 
the  fingers  of  your  right  hand  along  u,  and  close  them  in  the  direction  of  v.  Notice  that  if  you  extend  the 
thumb  on  your  right  hand,  it  points  in  the  direction  of  w. 

You  should  consider  how  a  right  hand  system  would  differ  from  a  left  hand  system.  Try  using  your  left 
hand  and  you  will  see  that  the  vector  w  would  need  to  point  in  the  opposite  direction. 

Notice  that  the  special  vectors,  i,j,k  will  always  form  a  right  handed  system.  If  you  extend  the  fingers 
of  your  right  hand  along  i  and  close  them  in  the  direction  j,  the  thumb  points  in  the  direction  of  k. 


[Figure: the special vectors i, j, k]


The  following  is  the  geometric  description  of  the  cross  product.  Recall  that  the  dot  product  of  two 
vectors  results  in  a  scalar.  In  contrast,  the  cross  product  results  in  a  vector,  as  the  product  gives  a  direction 
as  well  as  magnitude. 
The cross product of the special vectors i, j, k is as follows.

i × j = k      j × i = -k
k × i = j      i × k = -j
j × k = i      k × j = -i

With  this  information,  the  following  gives  the  coordinate  description  of  the  cross  product. 

Recall that the vector u = [ u_1 u_2 u_3 ]^T can be written in terms of i, j, k as u = u_1 i + u_2 j + u_3 k.


Writing u × v in the usual way, it is given by

$$u \times v = \begin{bmatrix} u_2 v_3 - u_3 v_2 \\ -(u_1 v_3 - u_3 v_1) \\ u_1 v_2 - u_2 v_1 \end{bmatrix}$$


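We now prove this proposition.

Proof. From the above table and the properties of the cross product listed,

$$\begin{aligned}
u \times v &= (u_1 \vec{i} + u_2 \vec{j} + u_3 \vec{k}) \times (v_1 \vec{i} + v_2 \vec{j} + v_3 \vec{k}) \\
&= u_1 v_2 \, \vec{i} \times \vec{j} + u_1 v_3 \, \vec{i} \times \vec{k} + u_2 v_1 \, \vec{j} \times \vec{i} + u_2 v_3 \, \vec{j} \times \vec{k} + u_3 v_1 \, \vec{k} \times \vec{i} + u_3 v_2 \, \vec{k} \times \vec{j} \\
&= u_1 v_2 \vec{k} - u_1 v_3 \vec{j} - u_2 v_1 \vec{k} + u_2 v_3 \vec{i} + u_3 v_1 \vec{j} - u_3 v_2 \vec{i} \\
&= (u_2 v_3 - u_3 v_2) \vec{i} + (u_3 v_1 - u_1 v_3) \vec{j} + (u_1 v_2 - u_2 v_1) \vec{k}
\end{aligned} \tag{4.15}$$

*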




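There is another version of 4.14 which may be easier to remember. We can express the cross product
as the determinant of a matrix, as follows.

$$u \times v = \begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}$$

Expanding the determinant along the top row yields

$$\begin{aligned}
& \vec{i} (-1)^{1+1} \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix} + \vec{j} (-1)^{2+1} \begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix} + \vec{k} (-1)^{3+1} \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix} \\
&= \vec{i} \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix} - \vec{j} \begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix} + \vec{k} \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix}
\end{aligned}$$

Expanding these determinants leads to

$$(u_2 v_3 - u_3 v_2) \vec{i} - (u_1 v_3 - u_3 v_1) \vec{j} + (u_1 v_2 - u_2 v_1) \vec{k} \tag{4.16}$$

which is the same as 4.15.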

The  cross  product  satisfies  the  following  properties. 


Proof. Formula 1. follows immediately from the definition. The vectors u × v and v × u have the same
magnitude, ||u|| ||v|| sin θ, and an application of the right hand rule shows they have opposite direction.

Formula 2. is proven as follows. If k is a non-negative scalar, the direction of (ku) × v is the same as
the direction of u × v, k(u × v) and u × (kv). The magnitude is k times the magnitude of u × v which is the
same as the magnitude of k(u × v) and u × (kv). Using this yields equality in 2. In the case where k < 0,
everything works the same way except the vectors are all pointing in the opposite direction and you must
multiply by |k| when comparing their magnitudes.

The distributive laws, 3. and 4., are much harder to establish. For now, it suffices to notice that if we
know that 3. is true, 4. follows. Thus, assuming 3., and using 1.,

$$\begin{aligned}
(v + w) \times u &= -u \times (v + w) \\
&= -(u \times v + u \times w) \\
&= v \times u + w \times u
\end{aligned}$$

*

We will now look at an example of how to compute a cross product.


Example  4.51:  Find  a  Cross  Product 


Find u × v for the following vectors

$$u = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}, \quad v = \begin{bmatrix} 3 \\ -2 \\ 1 \end{bmatrix}$$

Solution. Note that we can write u, v in terms of the special vectors i, j, k as

u = i - j + 2k
v = 3i - 2j + k


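We will use the equation given by 4.16 to compute the cross product.

$$u \times v = \begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ 1 & -1 & 2 \\ 3 & -2 & 1 \end{vmatrix} = \begin{vmatrix} -1 & 2 \\ -2 & 1 \end{vmatrix} \vec{i} - \begin{vmatrix} 1 & 2 \\ 3 & 1 \end{vmatrix} \vec{j} + \begin{vmatrix} 1 & -1 \\ 3 & -2 \end{vmatrix} \vec{k} = 3\vec{i} + 5\vec{j} + \vec{k}$$

We can write this result in the usual way, as

$$u \times v = \begin{bmatrix} 3 \\ 5 \\ 1 \end{bmatrix}$$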


* 
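The coordinate description of the cross product translates directly into code. The sketch below is illustrative only; the function name `cross` is an assumption, and the three components follow the coordinate formula for u × v stated in this section.

```python
def cross(u, v):
    """Cross product of two vectors in R^3, using the coordinate formula."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return [u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1]


def dot(a, b):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


i, j, k = [1, 0, 0], [0, 1, 0], [0, 0, 1]
print(cross(i, j))  # [0, 0, 1], that is, i x j = k

u, v = [1, -1, 2], [3, -2, 1]
w = cross(u, v)
print(w)                     # [3, 5, 1]
print(dot(w, u), dot(w, v))  # 0 0, so u x v is perpendicular to both u and v
```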


An important geometrical application of the cross product is as follows. The size of the cross product,
||u × v||, is the area of the parallelogram determined by u and v, as shown in the following picture.


We  examine  this  concept  in  the  following  example. 
Solution. Notice that these vectors are the same as the ones given in Example 4.51. Recall from the
geometric description of the cross product, that the area of the parallelogram is simply the magnitude of
u × v. From Example 4.51, u × v = 3i + 5j + k. We can also write this as

$$u \times v = \begin{bmatrix} 3 \\ 5 \\ 1 \end{bmatrix}$$

Thus the area of the parallelogram is

$$\|u \times v\| = \sqrt{(3)(3) + (5)(5) + (1)(1)} = \sqrt{9 + 25 + 1} = \sqrt{35}$$


* 


We  can  also  use  this  concept  to  find  the  area  of  a  triangle.  Consider  the  following  example. 


Example  4.53:  Area  of  Triangle 


Find the area of the triangle determined by the points (1,2,3), (0,2,5), (5,1,2).


Solution. This triangle is obtained by connecting the three points with lines. Picking (1,2,3) as a starting
point, there are two displacement vectors, [ -1 0 2 ]^T and [ 4 -1 -1 ]^T. Notice that if we add
either of these vectors to the position vector of the starting point, the result is the position vectors of
the other two points. Now, the area of the triangle is half the area of the parallelogram determined by
[ -1 0 2 ]^T and [ 4 -1 -1 ]^T. The required cross product is given by

$$\begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix} \times \begin{bmatrix} 4 \\ -1 \\ -1 \end{bmatrix} = \begin{bmatrix} 2 \\ 7 \\ 1 \end{bmatrix}$$

Taking the size of this vector gives the area of the parallelogram, given by

$$\sqrt{(2)(2) + (7)(7) + (1)(1)} = \sqrt{4 + 49 + 1} = \sqrt{54}$$

Hence the area of the triangle is $\frac{1}{2}\sqrt{54} = \frac{3}{2}\sqrt{6}$.


* 
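The triangle computation can be checked numerically. This is a minimal sketch; `triangle_area` is a hypothetical helper that forms the two displacement vectors from the first point, takes their cross product, and halves its length.

```python
from math import sqrt, isclose


def cross(u, v):
    """Cross product of two vectors in R^3."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return [u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1]


def triangle_area(p, q, r):
    """Half the length of PQ x PR for three points in R^3."""
    pq = [b - a for a, b in zip(p, q)]
    pr = [b - a for a, b in zip(p, r)]
    n = cross(pq, pr)
    return sqrt(sum(x * x for x in n)) / 2


area = triangle_area((1, 2, 3), (0, 2, 5), (5, 1, 2))
print(area, isclose(area, 1.5 * sqrt(6)))
```

An area of zero would indicate that the three points lie on a common line.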


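In general, if you have three points in R^3, P, Q, R, the area of the triangle is given by

$$\frac{1}{2} \|\overrightarrow{PQ} \times \overrightarrow{PR}\|$$

Recall that $\overrightarrow{PQ}$ is the vector running from point P to point Q.

[Figure: triangle with vertices P, Q, R]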


In  the  next  section,  we  explore  another  application  of  the  cross  product. 
4.9.1. The Box Product


Recall that we can use the cross product to find the area of a parallelogram. It follows that we can use
the cross product together with the dot product to find the volume of a parallelepiped.

We  begin  with  a  definition. 


Definition  4.54:  Parallelepiped 


A parallelepiped determined by the three vectors, u, v, and w consists of

$$\{ ru + sv + tw : r, s, t \in [0,1] \}$$

That  is,  if  you  pick  three  numbers,  r,  s,  and  t  each  in  [0, 1]  and  form  ru  +  sv+  tw  then  the  collection 
of  all  such  points  makes  up  the  parallelepiped  determined  by  these  three  vectors. 


The  following  is  an  example  of  a  parallelepiped. 


Notice that the base of the parallelepiped is the parallelogram determined by the vectors u and v.
Therefore, its area is equal to ||u × v||. The height of the parallelepiped is ||w|| cos θ where θ is the angle
shown in the picture between w and u × v. The volume of this parallelepiped is the area of the base times
the height which is just

$$\|u \times v\| \|w\| \cos\theta = (u \times v) \cdot w$$

This expression is known as the box product and is sometimes written as [u, v, w]. You should consider
what happens if you interchange the v with the w or the u with the w. You can see geometrically from
drawing pictures that this merely introduces a minus sign. In any case the box product of three vectors
always equals either the volume of the parallelepiped determined by the three vectors or else -1 times this
volume.


Proposition  4.55:  The  Box  Product 


Let u, v, w be three vectors in R^3 that define a parallelepiped. Then the volume of the parallelepiped
is the absolute value of the box product, given by

$$|(u \times v) \cdot w|$$


Consider  an  example  of  this  concept. 
Solution. According to the above discussion, pick any two of these vectors, take the cross product and then
take the dot product of this with the third of these vectors. The result will be either the desired volume or
-1 times the desired volume. Therefore by taking the absolute value of the result, we obtain the volume.

We will take the cross product of u and v. This is given by

$$u \times v = \begin{bmatrix} 1 \\ 2 \\ -5 \end{bmatrix} \times \begin{bmatrix} 1 \\ 3 \\ -6 \end{bmatrix} = \begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ 1 & 2 & -5 \\ 1 & 3 & -6 \end{vmatrix} = 3\vec{i} + \vec{j} + \vec{k} = \begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix}$$

Now take the dot product of this vector with w which yields

$$\begin{aligned}
(u \times v) \cdot w &= \begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} 3 \\ 2 \\ 3 \end{bmatrix} \\
&= (3\vec{i} + \vec{j} + \vec{k}) \cdot (3\vec{i} + 2\vec{j} + 3\vec{k}) \\
&= 9 + 2 + 3 \\
&= 14
\end{aligned}$$

This shows the volume of this parallelepiped is 14 cubic units.

*

There  is  a  fundamental  observation  which  comes  directly  from  the  geometric  definitions  of  the  cross 
product  and  the  dot  product. 


Proof. This follows from observing that either (u × v) • w and u • (v × w) both give the volume of the
parallelepiped or they both give -1 times the volume.

*


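Recall that we can express the cross product as the determinant of a particular matrix. It turns out
that the same can be done for the box product. Suppose you have three vectors, u = [ a b c ]^T, v =
[ d e f ]^T, and w = [ g h i ]^T. Then the box product u • (v × w) is given by the following.

$$\begin{aligned}
u \cdot (v \times w) &= \begin{bmatrix} a \\ b \\ c \end{bmatrix} \cdot \begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ d & e & f \\ g & h & i \end{vmatrix} \\
&= a \begin{vmatrix} e & f \\ h & i \end{vmatrix} - b \begin{vmatrix} d & f \\ g & i \end{vmatrix} + c \begin{vmatrix} d & e \\ g & h \end{vmatrix} \\
&= \det \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}
\end{aligned}$$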

To  take  the  box  product,  you  can  simply  take  the  determinant  of  the  matrix  which  results  by  letting  the 
rows  be  the  components  of  the  given  vectors  in  the  order  in  which  they  occur  in  the  box  product. 

This  follows  directly  from  the  definition  of  the  cross  product  given  above  and  the  way  we  expand 
determinants.  Thus  the  volume  of  a  parallelepiped  determined  by  the  vectors  u,v,w  is  just  the  absolute 
value  of  the  above  determinant. 
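As a small illustration of the determinant form of the box product, the sketch below expands a 3 × 3 determinant along its first row, exactly as in the display above. The names `det3` and `box_volume` are hypothetical; the sample vectors are the ones whose cross and dot products are computed in the preceding worked solution.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix, expanded along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def box_volume(u, v, w):
    """Volume of the parallelepiped determined by u, v, w: |u . (v x w)|."""
    return abs(det3([u, v, w]))


print(box_volume([1, 2, -5], [1, 3, -6], [3, 2, 3]))  # 14
```

Swapping two of the vectors changes the sign of the determinant but not the volume, which is why the absolute value is taken.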


Exercises 


Exercise 4.9.1 Show that if a × u = 0 for any unit vector u, then a = 0.

Exercise 4.9.2 Find the area of the triangle determined by the three points, (1,2,3), (4,2,0) and (-3,2,1).


Exercise 4.9.3 Find the area of the triangle determined by the three points, (1,0,3), (4,1,0) and (-3,1,1).


Exercise 4.9.4 Find the area of the triangle determined by the three points, (1,2,3), (2,3,4) and (3,4,5).
Did  something  interesting  happen  here?  What  does  it  mean  geometrically? 


Exercise 4.9.5 Find the area of the parallelogram determined by the vectors $\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$, $\begin{bmatrix} 3 \\ -2 \\ 1 \end{bmatrix}$.

Exercise 4.9.6 Find the area of the parallelogram determined by the vectors $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$, $\begin{bmatrix} 4 \\ -2 \\ 1 \end{bmatrix}$.

Exercise 4.9.7 Is u × (v × w) = (u × v) × w? What is the meaning of u × v × w? Explain. Hint: Try
(j × j) × k.

Exercise 4.9.8 Verify directly that the coordinate description of the cross product, u × v has the property
that it is perpendicular to both u and v. Then show by direct computation that this coordinate description
satisfies

$$\begin{aligned}
\|u \times v\|^2 &= \|u\|^2 \|v\|^2 - (u \cdot v)^2 \\
&= \|u\|^2 \|v\|^2 \left( 1 - \cos^2(\theta) \right)
\end{aligned}$$

where θ is the angle included between the two vectors. Explain why ||u × v|| has the correct magnitude.


Exercise 4.9.9 Suppose A is a 3 × 3 skew symmetric matrix such that A^T = -A. Show there exists a vector
Ω such that for all u ∈ R^3

$$Au = \Omega \times u$$

Hint: Explain why, since A is skew symmetric it is of the form

$$A = \begin{bmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{bmatrix}$$

where the ω_i are numbers. Then consider ω_1 i + ω_2 j + ω_3 k.


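Exercise 4.9.10 Find the volume of the parallelepiped determined by the vectors

$$\begin{bmatrix} 1 \\ -7 \\ -5 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ -2 \\ -6 \end{bmatrix}, \quad \text{and} \quad \begin{bmatrix} 3 \\ 2 \\ 3 \end{bmatrix}$$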


Exercise 4.9.11 Suppose u, v, and w are three vectors whose components are all integers. Can you
conclude the volume of the parallelepiped determined from these three vectors will always be an integer?


Exercise  4.9.12  What  does  it  mean  geometrically  if  the  box  product  of  three  vectors  gives  zero? 

Exercise 4.9.13 Using Problem 4.9.12, find an equation of a plane containing the two position vectors, p
and q and the point 0. Hint: If (x, y, z) is a point on this plane, the volume of the parallelepiped determined
by (x, y, z) and the vectors p, q equals 0.


Exercise 4.9.14 Using the notion of the box product yielding either plus or minus the volume of the
parallelepiped determined by the given three vectors, show that

$$(u \times v) \cdot w = u \cdot (v \times w)$$


In  other  words,  the  dot  and  the  cross  can  be  switched  as  long  as  the  order  of  the  vectors  remains  the  same. 
Hint:  There  are  two  ways  to  do  this,  by  the  coordinate  description  of  the  dot  and  cross  product  and  by 
geometric  reasoning. 


Exercise 4.9.15 Simplify (u × v) • (v × w) × (w × z).

Exercise 4.9.16 Simplify ||u × v||^2 + (u • v)^2 - ||u||^2 ||v||^2.

Exercise 4.9.17 For u, v, w functions of t, prove the following product rules:
4.10 Spanning, Linear Independence and Basis in R^n


Outcomes 


A.  Determine  the  span  of  a  set  of  vectors,  and  determine  if  a  vector  is  contained  in  a  specified 
span. 

B.  Determine  if  a  set  of  vectors  is  linearly  independent. 

C.  Understand  the  concepts  of  subspace,  basis,  and  dimension. 

D.  Find  the  row  space,  column  space,  and  null  space  of  a  matrix. 


By generating all linear combinations of a set of vectors one can obtain various subsets of R^n which
we call subspaces. For example, what set of vectors in R^3 generates the XY-plane? What is the smallest
such set of vectors you can find? The tools of spanning, linear independence and basis are exactly what is
needed to answer these and similar questions and are the focus of this section. The following definition is
essential.


4.10.1.  Spanning  Set  of  Vectors 


We  begin  this  section  with  a  definition. 


Definition  4.59:  Span  of  a  Set  of  Vectors 


The collection of all linear combinations of a set of vectors {u_1, ···, u_k} in R^n is known as the span
of these vectors and is written as span{u_1, ···, u_k}.


Consider  the  following  example. 


Solution. You can see that any linear combination of the vectors u and v yields a vector of the form
[ x y 0 ]^T in the XY-plane.

Moreover every vector in the XY-plane is in fact such a linear combination of the vectors u and v.
That's because

$$\begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = (-2x + 3y) \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + (x - y) \begin{bmatrix} 3 \\ 2 \\ 0 \end{bmatrix}$$

Thus span{u, v} is precisely the XY-plane.

*

You can convince yourself that no single vector can span the XY-plane. In fact, take a moment to
consider what is meant by the span of a single vector.

However you can make the set larger if you wish. For example consider the larger set of vectors
{u, v, w} where w = [ 4 5 0 ]^T. Since the first two vectors already span the entire XY-plane, the span
is once again precisely the XY-plane and nothing has been gained. Of course if you add a new vector such
as w = [ 0 0 1 ]^T then it does span a different space. What is the span of u, v, w in this case?

The distinction between the sets {u, v} and {u, v, w} will be made using the concept of linear
independence.

Consider  the  vectors  u,v,  and  w  discussed  above.  In  the  next  example,  we  will  show  how  to  formally 
demonstrate  that  w  is  in  the  span  of  u  and  v. 


Solution. For a vector to be in span{u, v}, it must be a linear combination of these vectors. If w ∈
span{u, v}, we must be able to find scalars a, b such that

w = au + bv


We proceed as follows.

$$\begin{bmatrix} 4 \\ 5 \\ 0 \end{bmatrix} = a \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + b \begin{bmatrix} 3 \\ 2 \\ 0 \end{bmatrix}$$

This is equivalent to the following system of equations

a + 3b = 4
a + 2b = 5


We solve this system the usual way, constructing the augmented matrix and row reducing to find the
reduced row-echelon form.

$$\left[ \begin{array}{cc|c} 1 & 3 & 4 \\ 1 & 2 & 5 \end{array} \right] \rightarrow \cdots \rightarrow \left[ \begin{array}{cc|c} 1 & 0 & 7 \\ 0 & 1 & -1 \end{array} \right]$$

The solution is a = 7, b = -1. This means that

w = 7u - v

Therefore we can say that w is in span{u, v}.

*
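The row reduction used to decide whether a vector lies in a span can be sketched in code. The function `rref` below is a hypothetical helper written for this illustration (it is not a library routine); it performs Gauss-Jordan elimination with exact fractions on a matrix given as a list of rows.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row-echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        # Look for a row at or below pivot_row with a non-zero entry in this column.
        pivot = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]  # scale to a leading one
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == len(m):
            break
    return m


# Augmented matrix of the system a + 3b = 4, a + 2b = 5.
reduced = rref([[1, 3, 4], [1, 2, 5]])
a, b = reduced[0][2], reduced[1][2]
print(a, b)  # 7 -1
```

Reading off the last column gives the coefficients of the linear combination; an inconsistent system would instead produce a row of the form [0 0 | 1], meaning the vector is not in the span.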
4.10.2. Linearly Independent Set of Vectors


We now turn our attention to the following question: what linear combinations of a given set of vectors
{u_1, ···, u_k} in R^n yield the zero vector? Clearly 0u_1 + 0u_2 + ··· + 0u_k = 0, but is it possible to have
$\sum_{i=1}^{k} a_i u_i = 0$ without all coefficients being zero?

You can create examples where this easily happens. For example if u_1 = u_2, then 1u_1 - u_2 + 0u_3 +
··· + 0u_k = 0, no matter the vectors {u_3, ···, u_k}. But sometimes it can be more subtle.


You  can  see  that  the  linear  combination  does  yield  the  zero  vector  but  has  some  non-zero  coefficients. 
Thus  we  define  a  set  of  vectors  to  be  linearly  dependent  if  this  happens. 


Definition  4.63:  Linearly  Dependent  Set  of  Vectors 


A set of non-zero vectors {u_1, ···, u_k} in R^n is said to be linearly dependent if a linear combination
of these vectors without all coefficients being zero does yield the zero vector.


Note that if $\sum_{i=1}^{k} a_i u_i = 0$ and some coefficient is non-zero, say a_1 ≠ 0, then

$$u_1 = \frac{-1}{a_1} \sum_{i=2}^{k} a_i u_i$$

and thus u_1 is in the span of the other vectors. And the converse clearly works as well, so we get that a set
of vectors is linearly dependent precisely when one of its vectors is in the span of the other vectors of that
set.

In particular, you can show that the vector u_1 in the above example is in the span of the vectors
{u_2, u_3, u_4}.

If  a  set  of  vectors  is  NOT  linearly  dependent,  then  it  must  be  that  any  linear  combination  of  these 
vectors  which  yields  the  zero  vector  must  use  all  zero  coefficients.  This  is  a  very  important  notion,  and  we 
give  it  its  own  name  of  linear  independence. 
Note also that we require all vectors to be non-zero to form a linearly independent set.

To view this in a more familiar setting, form the n × k matrix A having these vectors as columns. Then
all we are saying is that the set {u_1, ···, u_k} is linearly independent precisely when AX = 0 has only the
trivial solution.

Here  is  an  example. 


Solution. Suppose that we have a linear combination au + bv + cw = 0. Then you can see that this can
only happen with a = b = c = 0.

As mentioned above, you can equivalently form the 3 × 3 matrix

$$A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}$$

and show that AX = 0 has only the trivial solution.

Thus this means the set {u, v, w} is linearly independent.

*

In terms of spanning, a set of vectors is linearly independent if it does not contain unnecessary vectors,
that is, no vector is in the span of the others.

Thus  we  put  all  this  together  in  the  following  important  theorem. 
The last sentence of this theorem is useful as it allows us to use the reduced row-echelon form of a
matrix to determine if a set of vectors is linearly independent. Let the vectors be columns of a matrix A.
Find the reduced row-echelon form of A. If each column has a leading one, then it follows that the vectors
are linearly independent.

Sometimes we refer to the condition regarding sums as follows: The set of vectors, {u_1, ···, u_k} is
linearly independent if and only if there is no nontrivial linear combination which equals the zero vector.
A nontrivial linear combination is one in which not all the scalars equal zero. Similarly, a trivial linear
combination is one in which all scalars equal zero.

Here is a detailed example in R^4.


Example  4.67:  Linear  Independence 


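Determine whether the set of vectors given by

$$\left\{ \begin{bmatrix} 1 \\ 2 \\ 3 \\ 0 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 3 \\ 2 \\ 2 \\ 0 \end{bmatrix} \right\}$$

is linearly independent. If it is linearly dependent, express one of the vectors as a linear combination
of the others.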


Solution. In this case the matrix of the corresponding homogeneous system of linear equations is

$$\left[ \begin{array}{cccc|c} 1 & 2 & 0 & 3 & 0 \\ 2 & 1 & 1 & 2 & 0 \\ 3 & 0 & 1 & 2 & 0 \\ 0 & 1 & 2 & 0 & 0 \end{array} \right]$$

The reduced row-echelon form is

$$\left[ \begin{array}{cccc|c} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{array} \right]$$

and so every column is a pivot column and the corresponding system AX = 0 only has the trivial solution.
Therefore, these vectors are linearly independent and there is no way to obtain one of the vectors as a
linear combination of the others.

*

Consider  another  example. 


Example  4.68:  Linear  Independence 


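Determine whether the set of vectors given by

$$\left\{ \begin{bmatrix} 1 \\ 2 \\ 3 \\ 0 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 3 \\ 2 \\ 2 \\ -1 \end{bmatrix} \right\}$$

is linearly independent. If it is linearly dependent, express one of the vectors as a linear combination
of the others.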


Solution. Form the $4 \times 4$ matrix $A$ having these vectors as columns:
\[
A = \left[\begin{array}{rrrr}
1 & 2 & 0 & 3 \\
2 & 1 & 1 & 2 \\
3 & 0 & 1 & 2 \\
0 & 1 & 2 & -1
\end{array}\right]
\]
Then by Theorem 4.66, the given set of vectors is linearly independent exactly if the system $AX = 0$ has
only the trivial solution.

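The augmented matrix for this system and corresponding reduced row-echelon form are given by
\[
\left[\begin{array}{rrrr|r}
1 & 2 & 0 & 3 & 0 \\
2 & 1 & 1 & 2 & 0 \\
3 & 0 & 1 & 2 & 0 \\
0 & 1 & 2 & -1 & 0
\end{array}\right]
\rightarrow \cdots \rightarrow
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 \\
0 & 0 & 1 & -1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\]
Not all the columns of the coefficient matrix are pivot columns and so the vectors are not linearly
independent. In this case, we say the vectors are linearly dependent.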

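It follows that there are infinitely many solutions to $AX = 0$, one of which is
\[
\left[\begin{array}{r} 1 \\ 1 \\ -1 \\ -1 \end{array}\right]
\]
Therefore we can write
\[
1 \left[\begin{array}{r} 1 \\ 2 \\ 3 \\ 0 \end{array}\right]
+ 1 \left[\begin{array}{r} 2 \\ 1 \\ 0 \\ 1 \end{array}\right]
- 1 \left[\begin{array}{r} 0 \\ 1 \\ 1 \\ 2 \end{array}\right]
- 1 \left[\begin{array}{r} 3 \\ 2 \\ 2 \\ -1 \end{array}\right]
= \left[\begin{array}{r} 0 \\ 0 \\ 0 \\ 0 \end{array}\right]
\]
This can be rearranged as follows
\[
1 \left[\begin{array}{r} 1 \\ 2 \\ 3 \\ 0 \end{array}\right]
+ 1 \left[\begin{array}{r} 2 \\ 1 \\ 0 \\ 1 \end{array}\right]
- 1 \left[\begin{array}{r} 0 \\ 1 \\ 1 \\ 2 \end{array}\right]
= \left[\begin{array}{r} 3 \\ 2 \\ 2 \\ -1 \end{array}\right]
\]
This gives the last vector as a linear combination of the first three vectors.

Notice that we could rearrange this equation to write any of the four vectors as a linear combination of
the other three. ♠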
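The dependence relation in this example can also be read off mechanically from the reduced row-echelon form: the entries of a non-pivot column are the coefficients that express the corresponding vector in terms of the pivot-column vectors. The sketch below illustrates this under the assumption of a small hand-written helper `rref` (a hypothetical name, not something the text defines).

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row-echelon form and the pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pv = m[r][c]
        m[r] = [x / pv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots

vectors = [[1, 2, 3, 0], [2, 1, 0, 1], [0, 1, 1, 2], [3, 2, 2, -1]]
A = [list(row) for row in zip(*vectors)]   # place the vectors as columns
R, pivots = rref(A)
free = [c for c in range(len(vectors)) if c not in pivots]

for c in free:
    # Coefficients of vectors[c] in terms of the pivot-column vectors.
    coeffs = [R[i][c] for i in range(len(pivots))]
    combo = [sum(k * vectors[p][j] for k, p in zip(coeffs, pivots))
             for j in range(len(vectors[c]))]
    print(c, [str(k) for k in coeffs], combo == vectors[c])
```

Here the only non-pivot column is the fourth one, and its coefficients reproduce the relation found above: the fourth vector equals the first plus the second minus the third.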


When given a linearly independent set of vectors, we can determine if related sets are linearly
independent.


Example  4.69:  Related  Sets  of  Vectors 


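Let $\{\vec{u}, \vec{v}, \vec{w}\}$ be an independent set of $\mathbb{R}^n$. Is $\{\vec{u} + \vec{v}, 2\vec{u} + \vec{w}, \vec{v} - 5\vec{w}\}$ linearly independent?


Solution. Suppose $a(\vec{u} + \vec{v}) + b(2\vec{u} + \vec{w}) + c(\vec{v} - 5\vec{w}) = \vec{0}_n$ for some $a, b, c \in \mathbb{R}$. Then

\[ (a + 2b)\vec{u} + (a + c)\vec{v} + (b - 5c)\vec{w} = \vec{0}_n. \]

Since $\{\vec{u}, \vec{v}, \vec{w}\}$ is independent,

\[
\begin{array}{rcl}
a + 2b & = & 0 \\
a + c & = & 0 \\
b - 5c & = & 0
\end{array}
\]

This system of three equations in three variables has the unique solution $a = b = c = 0$. Therefore,
$\{\vec{u} + \vec{v}, 2\vec{u} + \vec{w}, \vec{v} - 5\vec{w}\}$ is independent. ♠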
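The claim that this system has only the trivial solution can be verified by direct elimination. The following short derivation is offered as a check, not as the text's own argument:

```latex
\begin{aligned}
a + c = 0 &\implies c = -a, \\
a + 2b = 0 &\implies b = -\tfrac{1}{2}a, \\
b - 5c = 0 &\implies -\tfrac{1}{2}a + 5a = \tfrac{9}{2}a = 0 \implies a = 0 .
\end{aligned}
```

With $a = 0$ the first two lines give $b = c = 0$. Equivalently, the coefficient matrix of the system has determinant $9 \neq 0$, so the homogeneous system has only the trivial solution.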

The following corollary follows from the fact that if a homogeneous system of linear equations has
more variables than equations, the system has infinitely many solutions.


Corollary 4.70: Linear Dependence in $\mathbb{R}^n$


Let $\{\vec{u}_1, \cdots, \vec{u}_k\}$ be a set of vectors in $\mathbb{R}^n$. If $k > n$, then the set is linearly dependent (i.e. NOT
linearly independent).


Proof. Form the $n \times k$ matrix $A$ having the vectors $\{\vec{u}_1, \cdots, \vec{u}_k\}$ as its columns and suppose $k > n$. Then $A$
has rank $r \leq n < k$, so the system $AX = 0$ has a nontrivial solution and thus the set is not linearly
independent by Theorem 4.66. ♠
Solution. This set contains three vectors in $\mathbb{R}^2$. By Corollary 4.70 these vectors are linearly dependent. In
fact, we can write

\[
(-1) \left[\begin{array}{r} 1 \\ 4 \end{array}\right]
+ (2) \left[\begin{array}{r} 2 \\ 3 \end{array}\right]
= \left[\begin{array}{r} 3 \\ 2 \end{array}\right]
\]

showing that this set is linearly dependent.

The third vector in the previous example is in the span of the first two vectors. We could find a way to
write this vector as a linear combination of the other two vectors. It turns out that the linear combination
which we found is the only one, provided that the set is linearly independent.

Theorem  4.72:  Unique  Linear  Combination 


Let $U \subseteq \mathbb{R}^n$ be an independent set. Then any vector $\vec{x} \in \mathrm{span}(U)$ can be written uniquely as a linear
combination of vectors of $U$.


Proof. To prove this theorem, we will show that two linear combinations of vectors in $U$ that equal $\vec{x}$ must
be the same. Let $U = \{\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_k\}$. Suppose that there is a vector $\vec{x} \in \mathrm{span}(U)$ such that

\[ \vec{x} = s_1\vec{u}_1 + s_2\vec{u}_2 + \cdots + s_k\vec{u}_k, \text{ for some } s_1, s_2, \ldots, s_k \in \mathbb{R}, \text{ and} \]

\[ \vec{x} = t_1\vec{u}_1 + t_2\vec{u}_2 + \cdots + t_k\vec{u}_k, \text{ for some } t_1, t_2, \ldots, t_k \in \mathbb{R}. \]

Then $\vec{0}_n = \vec{x} - \vec{x} = (s_1 - t_1)\vec{u}_1 + (s_2 - t_2)\vec{u}_2 + \cdots + (s_k - t_k)\vec{u}_k$.

Since $U$ is independent, the only linear combination that vanishes is the trivial one, so $s_i - t_i = 0$ for
all $i$, $1 \leq i \leq k$.

Therefore, $s_i = t_i$ for all $i$, $1 \leq i \leq k$, and the representation is unique. ♠

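Suppose that $\vec{u}$, $\vec{v}$ and $\vec{w}$ are nonzero vectors in $\mathbb{R}^3$, and that $\{\vec{v}, \vec{w}\}$ is independent. Consider the set
$\{\vec{u}, \vec{v}, \vec{w}\}$. When can we know that this set is independent? It turns out that this follows exactly when
$\vec{u} \notin \mathrm{span}\{\vec{v}, \vec{w}\}$.


Example 4.73:


Suppose that $\vec{u}$, $\vec{v}$ and $\vec{w}$ are nonzero vectors in $\mathbb{R}^3$, and that $\{\vec{v}, \vec{w}\}$ is independent. Prove that $\{\vec{u}, \vec{v}, \vec{w}\}$
is independent if and only if $\vec{u} \notin \mathrm{span}\{\vec{v}, \vec{w}\}$.


Solution. If $\vec{u} \in \mathrm{span}\{\vec{v}, \vec{w}\}$, then there exist $a, b \in \mathbb{R}$ so that $\vec{u} = a\vec{v} + b\vec{w}$. This implies that $\vec{u} - a\vec{v} - b\vec{w} = \vec{0}_3$,
so $\vec{u} - a\vec{v} - b\vec{w}$ is a nontrivial linear combination of $\{\vec{u}, \vec{v}, \vec{w}\}$ that vanishes, and thus $\{\vec{u}, \vec{v}, \vec{w}\}$ is dependent.

Now suppose that $\vec{u} \notin \mathrm{span}\{\vec{v}, \vec{w}\}$, and suppose that there exist $a, b, c \in \mathbb{R}$ such that $a\vec{u} + b\vec{v} + c\vec{w} = \vec{0}_3$. If
$a \neq 0$, then $\vec{u} = -\frac{b}{a}\vec{v} - \frac{c}{a}\vec{w}$, and $\vec{u} \in \mathrm{span}\{\vec{v}, \vec{w}\}$, a contradiction. Therefore, $a = 0$, implying that $b\vec{v} + c\vec{w} = \vec{0}_3$.
Since $\{\vec{v}, \vec{w}\}$ is independent, $b = c = 0$, and thus $a = b = c = 0$, i.e., the only linear combination of $\vec{u}$, $\vec{v}$
and $\vec{w}$ that vanishes is the trivial one.

Therefore, $\{\vec{u}, \vec{v}, \vec{w}\}$ is independent. ♠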

Consider  the  following  useful  theorem. 


Theorem  4.74:  Invertible  Matrices 


Let $A$ be an invertible $n \times n$ matrix. Then the columns of $A$ are independent and span $\mathbb{R}^n$. Similarly,
the rows of $A$ are independent and span the set of all $1 \times n$ vectors.


This theorem also allows us to determine if a matrix is invertible. If an $n \times n$ matrix $A$ has columns
which are independent, or span $\mathbb{R}^n$, then it follows that $A$ is invertible. If it has rows that are independent,
or span the set of all $1 \times n$ vectors, then $A$ is invertible.

4.10.3.  A  Short  Application  to  Chemistry 


The  following  section  applies  the  concepts  of  spanning  and  linear  independence  to  the  subject  of  chemistry. 

When  working  with  chemical  reactions,  there  are  sometimes  a  large  number  of  reactions  and  some  are 
in  a  sense  redundant.  Suppose  you  have  the  following  chemical  reactions. 

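\[
\begin{array}{l}
CO + \frac{1}{2} O_2 \rightarrow CO_2 \\
H_2 + \frac{1}{2} O_2 \rightarrow H_2O \\
CH_4 + \frac{3}{2} O_2 \rightarrow CO + 2H_2O \\
CH_4 + 2O_2 \rightarrow CO_2 + 2H_2O
\end{array}
\]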

There  are  four  chemical  reactions  here  but  they  are  not  independent  reactions.  There  is  some  redundancy. 
What  are  the  independent  reactions?  Is  there  a  way  to  consider  a  shorter  list  of  reactions?  To  analyze  this 
situation,  we  can  write  the  reactions  in  a  matrix  as  follows 


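\[
\begin{array}{rrrrrr}
CO & O_2 & CO_2 & H_2 & H_2O & CH_4 \\
1 & 1/2 & -1 & 0 & 0 & 0 \\
0 & 1/2 & 0 & 1 & -1 & 0 \\
-1 & 3/2 & 0 & 0 & -2 & 1 \\
0 & 2 & -1 & 0 & -2 & 1
\end{array}
\]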

Each row contains the coefficients of the respective elements in each reaction. For example, the top
row of numbers comes from $CO + \frac{1}{2} O_2 - CO_2 = 0$ which represents the first of the chemical reactions.


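We can write these coefficients in the following matrix
\[
\left[\begin{array}{rrrrrr}
1 & 1/2 & -1 & 0 & 0 & 0 \\
0 & 1/2 & 0 & 1 & -1 & 0 \\
-1 & 3/2 & 0 & 0 & -2 & 1 \\
0 & 2 & -1 & 0 & -2 & 1
\end{array}\right]
\]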


Rather  than  listing  all  of  the  reactions  as  above,  it  would  be  more  efficient  to  only  list  those  which  are 
independent  by  throwing  out  that  which  is  redundant.  We  can  use  the  concepts  of  the  previous  section  to 
accomplish  this. 

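First, take the reduced row-echelon form of the above matrix.
\[
\left[\begin{array}{rrrrrr}
1 & 0 & 0 & 3 & -1 & -1 \\
0 & 1 & 0 & 2 & -2 & 0 \\
0 & 0 & 1 & 4 & -2 & -1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right]
\]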

The  top  three  rows  represent  “independent"  reactions  which  come  from  the  original  four  reactions.  One 
can  obtain  each  of  the  original  four  rows  of  the  matrix  given  above  by  taking  a  suitable  linear  combination 
of  rows  of  this  reduced  row-echelon  form  matrix. 

With the redundant reaction removed, we can consider the simplified reactions as the following
equations

\[
\begin{array}{l}
CO + 3H_2 - 1H_2O - 1CH_4 = 0 \\
O_2 + 2H_2 - 2H_2O = 0 \\
CO_2 + 4H_2 - 2H_2O - 1CH_4 = 0
\end{array}
\]

In terms of the original notation, these are the reactions

\[
\begin{array}{l}
CO + 3H_2 \rightarrow H_2O + CH_4 \\
O_2 + 2H_2 \rightarrow 2H_2O \\
CO_2 + 4H_2 \rightarrow 2H_2O + CH_4
\end{array}
\]

These  three  reactions  provide  an  equivalent  system  to  the  original  four  equations.  The  idea  is  that,  in 
terms  of  what  happens  chemically,  you  obtain  the  same  information  with  the  shorter  list  of  reactions.  Such 
a  simplification  is  especially  useful  when  dealing  with  very  large  lists  of  reactions  which  may  result  from 
experimental  evidence. 
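The reduction of the reaction list can be reproduced with a few lines of code. The sketch below is illustrative only: the helper `rref` and the variable names are assumptions, and exact fractions are used so that the half-integer coefficients are handled without rounding. The nonzero rows of the result correspond to the three independent reactions.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row-echelon form and the pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pv = m[r][c]
        m[r] = [x / pv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots

H = Fraction(1, 2)
species = ["CO", "O2", "CO2", "H2", "H2O", "CH4"]  # column labels
reactions = [
    [1, H, -1, 0, 0, 0],       # CO + 1/2 O2 - CO2 = 0
    [0, H, 0, 1, -1, 0],       # H2 + 1/2 O2 - H2O = 0
    [-1, 3 * H, 0, 0, -2, 1],  # CH4 + 3/2 O2 - CO - 2 H2O = 0
    [0, 2, -1, 0, -2, 1],      # CH4 + 2 O2 - CO2 - 2 H2O = 0
]
R, pivots = rref(reactions)
independent_reactions = [row for row in R if any(row)]  # drop zero rows
for row in independent_reactions:
    print(dict(zip(species, (str(x) for x in row))))
```

The number of rows kept equals the rank of the matrix, which is the number of independent reactions.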

4.10.4.  Subspaces  and  Basis 


The goal of this section is to develop an understanding of a subspace of $\mathbb{R}^n$. Before a precise definition is
considered, we first examine the subspace test given below.


This test allows us to determine if a given set is a subspace of $\mathbb{R}^n$. Notice that the subset $V = \{\vec{0}\}$ is a
subspace of $\mathbb{R}^n$ (called the zero subspace), as is $\mathbb{R}^n$ itself. A subspace which is not the zero subspace of
$\mathbb{R}^n$ is referred to as a proper subspace.

A subspace is simply a set of vectors with the property that linear combinations of these vectors remain
in the set. Geometrically in $\mathbb{R}^3$, it turns out that a subspace can be represented by either the origin as a
single point, lines and planes which contain the origin, or the entire space $\mathbb{R}^3$.

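Consider the following example of a line in $\mathbb{R}^3$.


Example 4.76: Subspace of $\mathbb{R}^3$


In $\mathbb{R}^3$, the line $L$ through the origin that is parallel to the vector
$\vec{d} = \left[\begin{array}{r} -5 \\ 1 \\ -4 \end{array}\right]$ has (vector) equation
\[
\left[\begin{array}{c} x \\ y \\ z \end{array}\right]
= t \left[\begin{array}{r} -5 \\ 1 \\ -4 \end{array}\right], \quad t \in \mathbb{R},
\]
so
\[ L = \left\{ t\vec{d} \mid t \in \mathbb{R} \right\}. \]
Then $L$ is a subspace of $\mathbb{R}^3$.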


Solution. Using the subspace test given above we can verify that $L$ is a subspace of $\mathbb{R}^3$.

• First: $\vec{0}_3 \in L$ since $0\vec{d} = \vec{0}_3$.

• Suppose $\vec{u}, \vec{v} \in L$. Then by definition, $\vec{u} = s\vec{d}$ and $\vec{v} = t\vec{d}$, for some $s, t \in \mathbb{R}$. Thus

\[ \vec{u} + \vec{v} = s\vec{d} + t\vec{d} = (s + t)\vec{d}. \]

Since $s + t \in \mathbb{R}$, $\vec{u} + \vec{v} \in L$; i.e., $L$ is closed under addition.

• Suppose $\vec{u} \in L$ and $k \in \mathbb{R}$ ($k$ is a scalar). Then $\vec{u} = t\vec{d}$, for some $t \in \mathbb{R}$, so

\[ k\vec{u} = k(t\vec{d}) = (kt)\vec{d}. \]

Since $kt \in \mathbb{R}$, $k\vec{u} \in L$; i.e., $L$ is closed under scalar multiplication.

Since $L$ satisfies all conditions of the subspace test, it follows that $L$ is a subspace. ♠

Note that there is nothing special about the vector $\vec{d}$ used in this example; the same proof works for
any nonzero vector $\vec{d} \in \mathbb{R}^3$, so any line through the origin is a subspace of $\mathbb{R}^3$.

We  are  now  prepared  to  examine  the  precise  definition  of  a  subspace  as  follows. 


Definition  4.77:  Subspace 


Let $V$ be a nonempty collection of vectors in $\mathbb{R}^n$. Then $V$ is called a subspace if whenever $a$ and $b$
are scalars and $\vec{u}$ and $\vec{v}$ are vectors in $V$, the linear combination $a\vec{u} + b\vec{v}$ is also in $V$.


More generally this means that a subspace contains the span of any finite collection of vectors in that
subspace. It turns out that in $\mathbb{R}^n$, a subspace is exactly the span of finitely many of its vectors.

Theorem 4.78: Subspaces are Spans


Let $V$ be a nonempty collection of vectors in $\mathbb{R}^n$. Then $V$ is a subspace of $\mathbb{R}^n$ if and only if there
exist vectors $\{\vec{u}_1, \cdots, \vec{u}_k\}$ in $V$ such that

\[ V = \mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_k\} \]

Furthermore, let $W$ be another subspace of $\mathbb{R}^n$ and suppose $\{\vec{u}_1, \cdots, \vec{u}_k\} \subseteq W$. Then it follows that
$V$ is a subset of $W$.


Note that since $W$ is arbitrary, the statement that $V \subseteq W$ means that any other subspace of $\mathbb{R}^n$ that
contains these vectors will also contain $V$.

Proof. We first show that if $V$ is a subspace, then it can be written as $V = \mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_k\}$. Pick a vector
$\vec{u}_1$ in $V$. If $V = \mathrm{span}\{\vec{u}_1\}$, then you have found your list of vectors and are done. If $V \neq \mathrm{span}\{\vec{u}_1\}$, then
there exists $\vec{u}_2$, a vector of $V$ which is not in $\mathrm{span}\{\vec{u}_1\}$. Consider $\mathrm{span}\{\vec{u}_1, \vec{u}_2\}$. If $V = \mathrm{span}\{\vec{u}_1, \vec{u}_2\}$, we are
done. Otherwise, pick $\vec{u}_3$ not in $\mathrm{span}\{\vec{u}_1, \vec{u}_2\}$. Continue this way. Note that since $V$ is a subspace, these
spans are each contained in $V$. The process must stop with $\vec{u}_k$ for some $k \leq n$ by Corollary 4.70, and thus
$V = \mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_k\}$.

Now suppose $V = \mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_k\}$; we must show this is a subspace. So let $\sum_{i=1}^{k} c_i\vec{u}_i$ and $\sum_{i=1}^{k} d_i\vec{u}_i$ be
two vectors in $V$, and let $a$ and $b$ be two scalars. Then

\[ a \sum_{i=1}^{k} c_i\vec{u}_i + b \sum_{i=1}^{k} d_i\vec{u}_i = \sum_{i=1}^{k} (ac_i + bd_i)\vec{u}_i \]

which is one of the vectors in $\mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_k\}$ and is therefore contained in $V$. This shows that
$\mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_k\}$ has the properties of a subspace.

To prove that $V \subseteq W$, we prove that if $\vec{u} \in V$, then $\vec{u} \in W$.

Suppose $\vec{u} \in V$. Then $\vec{u} = a_1\vec{u}_1 + a_2\vec{u}_2 + \cdots + a_k\vec{u}_k$ for some $a_i \in \mathbb{R}$, $1 \leq i \leq k$. Since $W$ contains each
$\vec{u}_i$ and $W$ is a subspace, it follows that $a_1\vec{u}_1 + a_2\vec{u}_2 + \cdots + a_k\vec{u}_k \in W$. ♠

Since the vectors $\vec{u}_i$ we constructed in the proof above are not in the span of the previous vectors (by
definition), they must be linearly independent and thus we obtain the following corollary.


Corollary 4.79: Subspaces are Spans of Independent Vectors


If $V$ is a subspace of $\mathbb{R}^n$, then there exist linearly independent vectors $\{\vec{u}_1, \cdots, \vec{u}_k\}$ in $V$ such that
$V = \mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_k\}$.


In summary, subspaces of $\mathbb{R}^n$ consist of spans of finite, linearly independent collections of vectors of
$\mathbb{R}^n$. Such a collection of vectors is called a basis.

The following is a simple but very useful example of a basis, called the standard basis.


Definition 4.81: Standard Basis of $\mathbb{R}^n$


Let $\vec{e}_i$ be the vector in $\mathbb{R}^n$ which has a 1 in the $i$th entry and zeros elsewhere, that is the $i$th column of
the identity matrix. Then the collection $\{\vec{e}_1, \vec{e}_2, \cdots, \vec{e}_n\}$ is a basis for $\mathbb{R}^n$ and is called the standard
basis of $\mathbb{R}^n$.


The main theorem about bases is not only that they exist, but that they must be of the same size. To show
this, we will need the following fundamental result, called the Exchange Theorem.


Theorem  4.82:  Exchange  Theorem 


Suppose $\{\vec{u}_1, \cdots, \vec{u}_r\}$ is a linearly independent set of vectors in $\mathbb{R}^n$, and each $\vec{u}_k$ is contained in
$\mathrm{span}\{\vec{v}_1, \cdots, \vec{v}_s\}$. Then $s \geq r$.

In words, spanning sets have at least as many vectors as linearly independent sets.


Proof. Since each $\vec{u}_j$ is in $\mathrm{span}\{\vec{v}_1, \cdots, \vec{v}_s\}$, there exist scalars $a_{ij}$ such that

\[ \vec{u}_j = \sum_{i=1}^{s} a_{ij}\vec{v}_i \]

Suppose for a contradiction that $s < r$. Then the matrix $A = [a_{ij}]$ has fewer rows, $s$, than columns, $r$. Then
the system $AX = 0$ has a nontrivial solution $\vec{d}$, that is there is a $\vec{d} \neq \vec{0}$ such that $A\vec{d} = \vec{0}$. In other words,

\[ \sum_{j=1}^{r} a_{ij}d_j = 0, \quad i = 1, 2, \cdots, s \]

Therefore,

\[
\sum_{j=1}^{r} d_j\vec{u}_j = \sum_{j=1}^{r} d_j \sum_{i=1}^{s} a_{ij}\vec{v}_i
= \sum_{i=1}^{s} \left( \sum_{j=1}^{r} a_{ij}d_j \right) \vec{v}_i = \sum_{i=1}^{s} 0\vec{v}_i = \vec{0}
\]

which contradicts the assumption that $\{\vec{u}_1, \cdots, \vec{u}_r\}$ is linearly independent, because not all the $d_j$ are zero.
Thus this contradiction indicates that $s \geq r$. ♠
We are now ready to show that any two bases are of the same size.


Theorem 4.83: Bases of $\mathbb{R}^n$ are of the Same Size


Let $V$ be a subspace of $\mathbb{R}^n$ with two bases $B_1$ and $B_2$. Suppose $B_1$ contains $s$ vectors and $B_2$ contains
$r$ vectors. Then $s = r$.


Proof. This follows right away from Theorem 4.82, the Exchange Theorem. Indeed observe that $B_1 = \{\vec{u}_1, \cdots, \vec{u}_s\}$ is a spanning
set for $V$ while $B_2 = \{\vec{v}_1, \cdots, \vec{v}_r\}$ is linearly independent, so $s \geq r$. Similarly $B_2 = \{\vec{v}_1, \cdots, \vec{v}_r\}$ is a spanning
set for $V$ while $B_1 = \{\vec{u}_1, \cdots, \vec{u}_s\}$ is linearly independent, so $r \geq s$. ♠

The  following  definition  can  now  be  stated. 


Definition  4.84:  Dimension  of  a  Subspace 


Let $V$ be a subspace of $\mathbb{R}^n$. Then the dimension of $V$, written $\dim(V)$, is defined to be the number
of vectors in a basis.


The next result follows.


Proof. You only need to exhibit a basis for $\mathbb{R}^n$ which has $n$ vectors. Such a basis is the standard basis
$\{\vec{e}_1, \cdots, \vec{e}_n\}$. ♠

Consider  the  following  example. 


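Solution. The condition $a - b = d - c$ is equivalent to the condition $a = b - c + d$, so we may write

\[
V = \left\{ \left[\begin{array}{c} b - c + d \\ b \\ c \\ d \end{array}\right] : b, c, d \in \mathbb{R} \right\}
= \left\{ b \left[\begin{array}{r} 1 \\ 1 \\ 0 \\ 0 \end{array}\right]
+ c \left[\begin{array}{r} -1 \\ 0 \\ 1 \\ 0 \end{array}\right]
+ d \left[\begin{array}{r} 1 \\ 0 \\ 0 \\ 1 \end{array}\right] : b, c, d \in \mathbb{R} \right\}
\]

This shows that $V$ is a subspace of $\mathbb{R}^4$, since $V = \mathrm{span}\{\vec{u}_1, \vec{u}_2, \vec{u}_3\}$ where

\[
\vec{u}_1 = \left[\begin{array}{r} 1 \\ 1 \\ 0 \\ 0 \end{array}\right], \quad
\vec{u}_2 = \left[\begin{array}{r} -1 \\ 0 \\ 1 \\ 0 \end{array}\right], \quad
\vec{u}_3 = \left[\begin{array}{r} 1 \\ 0 \\ 0 \\ 1 \end{array}\right]
\]

Furthermore, $\{\vec{u}_1, \vec{u}_2, \vec{u}_3\}$
is linearly independent, as can be seen by taking the reduced row-echelon form of the matrix whose
columns are $\vec{u}_1$, $\vec{u}_2$ and $\vec{u}_3$.

\[
\left[\begin{array}{rrr}
1 & -1 & 1 \\
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{array}\right]
\rightarrow
\left[\begin{array}{rrr}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
0 & 0 & 0
\end{array}\right]
\]

Since every column of the reduced row-echelon form matrix has a leading one, the columns are
linearly independent.

Therefore $\{\vec{u}_1, \vec{u}_2, \vec{u}_3\}$ is linearly independent and spans $V$, so is a basis of $V$. Hence $V$ has dimension
three. ♠

We continue by stating further properties of a set of vectors in $\mathbb{R}^n$.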


Proof. Assume first that $\{\vec{u}_1, \cdots, \vec{u}_n\}$ is linearly independent, and we need to show that this set spans $\mathbb{R}^n$.
To do so, let $\vec{v}$ be a vector of $\mathbb{R}^n$, and we need to write $\vec{v}$ as a linear combination of the $\vec{u}_i$'s. Consider the
matrix $A$ having the vectors $\vec{u}_i$ as columns:

\[ A = \left[\begin{array}{ccc} \vec{u}_1 & \cdots & \vec{u}_n \end{array}\right] \]

By linear independence of the $\vec{u}_i$'s, the reduced row-echelon form of $A$ is the identity matrix. Therefore
the system $A\vec{x} = \vec{v}$ has a (unique) solution, so $\vec{v}$ is a linear combination of the $\vec{u}_i$'s.

To establish the second claim, suppose that $m < n$. Then letting $\vec{u}_{i_1}, \cdots, \vec{u}_{i_k}$ be the pivot columns of the
matrix

\[ \left[\begin{array}{ccc} \vec{u}_1 & \cdots & \vec{u}_m \end{array}\right] \]

it follows that $k \leq m < n$ and these $k$ pivot columns would be a basis for $\mathbb{R}^n$ having fewer than $n$ vectors,
contrary to Corollary 4.85.

Finally consider the third claim. If $\{\vec{u}_1, \cdots, \vec{u}_n\}$ is not linearly independent, then replace this list with
$\{\vec{u}_{i_1}, \cdots, \vec{u}_{i_k}\}$ where these are the pivot columns of the matrix

\[ \left[\begin{array}{ccc} \vec{u}_1 & \cdots & \vec{u}_n \end{array}\right] \]

Then $\{\vec{u}_{i_1}, \cdots, \vec{u}_{i_k}\}$ spans $\mathbb{R}^n$ and is linearly independent, so it is a basis having fewer than $n$ vectors, again
contrary to Corollary 4.85. ♠

The  next  theorem  follows  from  the  above  claim. 


Theorem  4.88:  Existence  of  Basis 


Let $V$ be a subspace of $\mathbb{R}^n$. Then there exists a basis of $V$ with $\dim(V) \leq n$.


Consider Corollary 4.87 together with Theorem 4.88. Let $\dim(V) = r$. Suppose there exists an
independent set of vectors in $V$. If this set contains $r$ vectors, then it is a basis for $V$. If it contains fewer than $r$
vectors, then vectors can be added to the set to create a basis of $V$. Similarly, any spanning set of $V$ which
contains more than $r$ vectors can have vectors removed to create a basis of $V$.

We  illustrate  this  concept  in  the  next  example. 


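Solution. To extend $S$ to a basis of $U$, find a vector in $U$ that is not in $\mathrm{span}(S)$.

\[
\left[\begin{array}{rrr}
1 & 2 & ? \\
1 & 3 & ? \\
1 & 3 & ? \\
1 & 2 & ?
\end{array}\right]
\]

\[
\left[\begin{array}{rrr}
1 & 2 & 1 \\
1 & 3 & 0 \\
1 & 3 & -1 \\
1 & 2 & 0
\end{array}\right]
\rightarrow
\left[\begin{array}{rrr}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
0 & 0 & 0
\end{array}\right]
\]

Therefore, $S$ can be extended to the following basis of $U$:

\[
\left\{
\left[\begin{array}{r} 1 \\ 1 \\ 1 \\ 1 \end{array}\right],
\left[\begin{array}{r} 2 \\ 3 \\ 3 \\ 2 \end{array}\right],
\left[\begin{array}{r} 1 \\ 0 \\ -1 \\ 0 \end{array}\right]
\right\}
\]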


Next  we  consider  the  case  of  removing  vectors  from  a  spanning  set  to  result  in  a  basis. 


Theorem  4.90:  Finding  a  Basis  from  a  Span 


Let $W$ be a subspace. Also suppose that $W = \mathrm{span}\{\vec{w}_1, \cdots, \vec{w}_m\}$. Then there exists a subset of
$\{\vec{w}_1, \cdots, \vec{w}_m\}$ which is a basis for $W$.


Proof. Let $S$ denote the set of positive integers such that for $k \in S$, there exists a subset of $\{\vec{w}_1, \cdots, \vec{w}_m\}$
consisting of exactly $k$ vectors which is a spanning set for $W$. Thus $m \in S$. Pick the smallest positive
integer in $S$. Call it $k$. Then there exists $\{\vec{u}_1, \cdots, \vec{u}_k\} \subseteq \{\vec{w}_1, \cdots, \vec{w}_m\}$ such that $\mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_k\} = W$. If

\[ \sum_{i=1}^{k} c_i\vec{u}_i = \vec{0} \]

and not all of the $c_i = 0$, then you could pick $c_j \neq 0$, divide by it and solve for $\vec{u}_j$ in terms of the others,

\[ \vec{u}_j = \sum_{i \neq j} \left( -\frac{c_i}{c_j} \right) \vec{u}_i \]

Then you could delete $\vec{u}_j$ from the list and have the same span. Any linear combination involving $\vec{u}_j$
would equal one in which $\vec{u}_j$ is replaced with the above sum, showing that it could have been obtained as
a linear combination of $\vec{u}_i$ for $i \neq j$. Thus $k - 1 \in S$, contrary to the choice of $k$. Hence each $c_i = 0$ and so
$\{\vec{u}_1, \cdots, \vec{u}_k\}$ is a basis for $W$ consisting of vectors of $\{\vec{w}_1, \cdots, \vec{w}_m\}$. ♠


The  following  example  illustrates  how  to  carry  out  this  shrinking  process  which  will  obtain  a  subset  of 
a  span  of  vectors  which  is  linearly  independent. 
Solution. You can use the reduced row-echelon form to accomplish this reduction. Form the matrix which
has the given vectors as columns.

\[
\left[\begin{array}{rrrrrr}
1 & 1 & 8 & -6 & 1 & 1 \\
2 & 3 & 19 & -15 & 3 & 5 \\
-1 & -1 & -8 & 6 & 0 & 0 \\
1 & 1 & 8 & -6 & 1 & 1
\end{array}\right]
\]

Then take the reduced row-echelon form

\[
\left[\begin{array}{rrrrrr}
1 & 0 & 5 & -3 & 0 & -2 \\
0 & 1 & 3 & -3 & 0 & 2 \\
0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right]
\]

It follows that a basis for $W$ is

\[
\left\{
\left[\begin{array}{r} 1 \\ 2 \\ -1 \\ 1 \end{array}\right],
\left[\begin{array}{r} 1 \\ 3 \\ -1 \\ 1 \end{array}\right],
\left[\begin{array}{r} 1 \\ 3 \\ 0 \\ 1 \end{array}\right]
\right\}
\]

Since the first, second, and fifth columns are obviously a basis for the column space of the reduced
row-echelon form, the same is true for the matrix having the given vectors as columns. ♠
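The shrinking procedure of this example amounts to locating the pivot columns and keeping the matching columns of the original matrix. The sketch below shows one way to do this; the helper `rref` is a hypothetical name introduced here for illustration and uses exact fractions.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row-echelon form and the pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pv = m[r][c]
        m[r] = [x / pv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots

# The six spanning vectors are the columns of A.
A = [[1, 1, 8, -6, 1, 1],
     [2, 3, 19, -15, 3, 5],
     [-1, -1, -8, 6, 0, 0],
     [1, 1, 8, -6, 1, 1]]
R, pivots = rref(A)
# Keep the columns of the ORIGINAL matrix that sit in pivot positions.
basis = [[row[c] for row in A] for c in pivots]
print(pivots)
print(basis)
```

Note that the basis vectors are taken from the original matrix, not from its reduced row-echelon form; row operations preserve the dependence relations among columns but change the columns themselves.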


Consider  the  following  theorems  regarding  a  subspace  contained  in  another  subspace. 


Theorem  4.92:  Subset  of  a  Subspace 


Let $V$ and $W$ be subspaces and suppose that $W \subseteq V$. Then $\dim(W) \leq \dim(V)$ with equality
when $W = V$.


Theorem 4.93: Extending a Basis


Let $W$ be any non-zero subspace of $\mathbb{R}^n$ and let $W \subseteq V$ where $V$ is also a subspace of $\mathbb{R}^n$. Then every
basis of $W$ can be extended to a basis for $V$.


The proof is left as an exercise but proceeds as follows. Begin with a basis for $W$, $\{\vec{w}_1, \cdots, \vec{w}_s\}$, and
add in vectors from $V$ until you obtain a basis for $V$. Note that the process will stop because the dimension
of $V$ is no more than $n$.

Consider  the  following  example. 


Example  4.94:  Extending  a  Basis 


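Let $V = \mathbb{R}^4$ and let

\[
W = \mathrm{span}\left\{
\left[\begin{array}{r} 1 \\ 0 \\ 1 \\ 1 \end{array}\right],
\left[\begin{array}{r} 0 \\ 1 \\ 0 \\ 1 \end{array}\right]
\right\}
\]

Extend this basis of $W$ to a basis of $\mathbb{R}^4$.

Solution. An easy way to do this is to take the reduced row-echelon form of the matrix

\[
\left[\begin{array}{rrrrrr}
1 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 & 0 & 1
\end{array}\right]
\tag{4.17}
\]

Note how the given vectors were placed as the first two columns and then the matrix was extended in such
a way that it is clear that the span of the columns of this matrix yields all of $\mathbb{R}^4$. Now determine the pivot
columns. The reduced row-echelon form is

\[
\left[\begin{array}{rrrrrr}
1 & 0 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & -1 & 1 \\
0 & 0 & 1 & 0 & -1 & 0 \\
0 & 0 & 0 & 1 & 1 & -1
\end{array}\right]
\tag{4.18}
\]

Therefore the pivot columns are

\[
\left[\begin{array}{r} 1 \\ 0 \\ 1 \\ 1 \end{array}\right],
\left[\begin{array}{r} 0 \\ 1 \\ 0 \\ 1 \end{array}\right],
\left[\begin{array}{r} 1 \\ 0 \\ 0 \\ 0 \end{array}\right],
\left[\begin{array}{r} 0 \\ 1 \\ 0 \\ 0 \end{array}\right]
\]

and now this is an extension of the given basis for $W$ to a basis for $\mathbb{R}^4$.

Why does this work? The columns of 4.17 obviously span $\mathbb{R}^4$. In fact the span of the first four is the
same as the span of all six. ♠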
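The extension trick of this example (append the standard basis vectors, then keep the pivot columns) is easy to automate. The following sketch assumes a small helper `rref` written for illustration; the names `W_basis`, `augmented` and `extended` are likewise hypothetical.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row-echelon form and the pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pv = m[r][c]
        m[r] = [x / pv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots

n = 4
W_basis = [[1, 0, 1, 1], [0, 1, 0, 1]]
standard = [[1 if i == j else 0 for i in range(n)] for j in range(n)]
columns = W_basis + standard                      # given vectors first, then e_1..e_4
augmented = [list(row) for row in zip(*columns)]  # matrix (4.17), vectors as columns
R, pivots = rref(augmented)
extended = [columns[c] for c in pivots]           # pivot columns of the original matrix
print(pivots)
print(extended)
```

Because the given vectors are placed first and are independent, they are always selected as pivot columns, so the result extends the original basis rather than replacing it.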


Consider  another  example. 


Solution. Note that the above vectors are not linearly independent, but their span, denoted as $V$, is a
subspace which does include the subspace $W$.

Using the process outlined in the previous example, form the following matrix

\[
\left[\begin{array}{rrrrr}
1 & 0 & 7 & -5 & 0 \\
0 & 1 & -6 & 7 & 0 \\
1 & 1 & 1 & 2 & 0 \\
0 & 1 & -6 & 7 & 1
\end{array}\right]
\]

Next find its reduced row-echelon form

\[
\left[\begin{array}{rrrrr}
1 & 0 & 7 & -5 & 0 \\
0 & 1 & -6 & 7 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\]

It follows that a basis for $V$ consists of the first two vectors and the last.

Thus $V$ is of dimension 3 and it has a basis which extends the basis for $W$. ♠


4.10.5.  Row  Space,  Column  Space,  and  Null  Space  of  a  Matrix 


We  begin  this  section  with  a  new  definition. 


Definition  4.96:  Row  and  Column  Space 


Let $A$ be an $m \times n$ matrix. The column space of $A$, written $\mathrm{col}(A)$, is the span of the columns. The
row space of $A$, written $\mathrm{row}(A)$, is the span of the rows.


Using the reduced row-echelon form, we can obtain an efficient description of the row and column
space of a matrix. Consider the following lemma.


Lemma 4.97: Effect of Row Operations on Row Space


Let $A$ and $B$ be $m \times n$ matrices such that $A$ can be carried to $B$ by elementary row [column] operations.
Then $\mathrm{row}(A) = \mathrm{row}(B)$ [$\mathrm{col}(A) = \mathrm{col}(B)$].


Proof. We will prove that the above is true for row operations; the same argument applies to column
operations.

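Let $\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_m$ denote the rows of $A$.

• If $B$ is obtained from $A$ by interchanging two rows of $A$, then $A$ and $B$ have exactly the same rows,
so $\mathrm{row}(B) = \mathrm{row}(A)$.

• Suppose $p \neq 0$, and suppose that for some $j$, $1 \leq j \leq m$, $B$ is obtained from $A$ by multiplying row $j$
by $p$. Then

\[ \mathrm{row}(B) = \mathrm{span}\{\vec{r}_1, \ldots, p\vec{r}_j, \ldots, \vec{r}_m\}. \]

Since

\[ \{\vec{r}_1, \ldots, p\vec{r}_j, \ldots, \vec{r}_m\} \subseteq \mathrm{row}(A), \]

it follows that $\mathrm{row}(B) \subseteq \mathrm{row}(A)$. Conversely, since

\[ \{\vec{r}_1, \ldots, \vec{r}_m\} \subseteq \mathrm{row}(B), \]

it follows that $\mathrm{row}(A) \subseteq \mathrm{row}(B)$. Therefore, $\mathrm{row}(B) = \mathrm{row}(A)$.

• Suppose $p \neq 0$, and suppose that for some $i$ and $j$, $1 \leq i, j \leq m$, $B$ is obtained from $A$ by adding $p$
times row $j$ to row $i$. Without loss of generality, we may assume $i < j$.

Then

\[ \mathrm{row}(B) = \mathrm{span}\{\vec{r}_1, \ldots, \vec{r}_{i-1}, \vec{r}_i + p\vec{r}_j, \ldots, \vec{r}_j, \ldots, \vec{r}_m\}. \]

Since

\[ \{\vec{r}_1, \ldots, \vec{r}_{i-1}, \vec{r}_i + p\vec{r}_j, \ldots, \vec{r}_m\} \subseteq \mathrm{row}(A), \]

it follows that $\mathrm{row}(B) \subseteq \mathrm{row}(A)$.

Conversely, since

\[ \{\vec{r}_1, \ldots, \vec{r}_m\} \subseteq \mathrm{row}(B), \]

it follows that $\mathrm{row}(A) \subseteq \mathrm{row}(B)$. Therefore, $\mathrm{row}(B) = \mathrm{row}(A)$. ♠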


Consider  the  following  lemma. 


Lemma  4.98:  Row  Space  of  a  Row-Echelon  Form  Matrix 


Let  A  be  an  m  x  n  matrix  and  let  R  be  its  row-echelon  form.  Then  the  nonzero  rows  of  R  form  a 
basis  of  row(R) ,  and  consequently  of  row(A) . 


This  lemma  suggests  that  we  can  examine  the  row-echelon  form  of  a  matrix  in  order  to  obtain  the 
row  space.  Consider  now  the  column  space.  The  column  space  can  be  obtained  by  simply  saying  that  it 
equals  the  span  of  all  the  columns.  However,  you  can  often  get  the  column  space  as  the  span  of  fewer 
columns  than  this.  A  variation  of  the  previous  lemma  provides  a  solution.  Suppose  A  is  row  reduced  to 
its  row-echelon  form  R.  Identify  the  pivot  columns  of  R  (columns  which  have  leading  ones),  and  take  the 
corresponding  columns  of  A.  It  turns  out  that  this  forms  a  basis  of  col(A). 

Before  proceeding  to  an  example  of  this  concept,  we  revisit  the  definition  of  rank. 
Definition 4.99: Rank of a Matrix


Previously, we defined $\mathrm{rank}(A)$ to be the number of leading entries in the row-echelon form of $A$.
Using an understanding of dimension and row space, we can now define rank as follows:

\[ \mathrm{rank}(A) = \dim(\mathrm{row}(A)) \]


Consider  the  following  example. 


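Example 4.100: Rank, Column and Row Space

Find the rank of the following matrix and describe the column and row spaces.

\[
A = \left[\begin{array}{rrrrr}
1 & 2 & 1 & 3 & 2 \\
1 & 3 & 6 & 0 & 2 \\
3 & 7 & 8 & 6 & 6
\end{array}\right]
\]

Solution. The reduced row-echelon form of $A$ is

\[
\left[\begin{array}{rrrrr}
1 & 0 & -9 & 9 & 2 \\
0 & 1 & 5 & -3 & 0 \\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\]

Therefore, the rank is 2.

Notice that the first two columns of $R$ are pivot columns. By the discussion following Lemma 4.98,
we find the corresponding columns of $A$, in this case the first two columns. Therefore a basis for $\mathrm{col}(A)$ is
given by

\[
\left\{
\left[\begin{array}{r} 1 \\ 1 \\ 3 \end{array}\right],
\left[\begin{array}{r} 2 \\ 3 \\ 7 \end{array}\right]
\right\}
\]

For example, consider the third column of the original matrix. It can be written as a linear combination
of the first two columns of the original matrix as follows.

\[
\left[\begin{array}{r} 1 \\ 6 \\ 8 \end{array}\right]
= -9 \left[\begin{array}{r} 1 \\ 1 \\ 3 \end{array}\right]
+ 5 \left[\begin{array}{r} 2 \\ 3 \\ 7 \end{array}\right]
\]

What about an efficient description of the row space? By Lemma 4.98 we know that the nonzero rows
of $R$ create a basis of $\mathrm{row}(A)$. For the above matrix, the row space equals

\[
\mathrm{row}(A) = \mathrm{span}\left\{
\left[\begin{array}{rrrrr} 1 & 0 & -9 & 9 & 2 \end{array}\right],
\left[\begin{array}{rrrrr} 0 & 1 & 5 & -3 & 0 \end{array}\right]
\right\}
\]

♠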


Notice  that  the  column  space  of  A  is  given  as  the  span  of  columns  of  the  original  matrix,  while  the  row 
space  of  A  is  the  span  of  rows  of  the  reduced  row-echelon  form  of  A. 
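The contrast just noted, column space from the original matrix and row space from the reduced matrix, shows up clearly when the computation is scripted. The sketch below uses the matrix of Example 4.100; the helper `rref` and the variable names are assumptions introduced for illustration.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row-echelon form and the pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pv = m[r][c]
        m[r] = [x / pv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots

A = [[1, 2, 1, 3, 2],
     [1, 3, 6, 0, 2],
     [3, 7, 8, 6, 6]]
R, pivots = rref(A)
rank = len(pivots)
col_basis = [[row[c] for row in A] for c in pivots]  # columns of the original A
row_basis = [row for row in R if any(row)]           # nonzero rows of R
print(rank)
print(col_basis)
print([[str(x) for x in row] for row in row_basis])
```

Both bases contain `rank` vectors, which anticipates the equality of the dimensions of the row and column spaces.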

Consider  another  example. 
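Example 4.101: Rank, Column and Row Space

Find the rank of the following matrix and describe the column and row spaces.

\[
\left[\begin{array}{rrrrr}
1 & 2 & 1 & 3 & 2 \\
1 & 3 & 6 & 0 & 2 \\
1 & 2 & 1 & 3 & 2 \\
1 & 3 & 2 & 4 & 0
\end{array}\right]
\]

Solution. The reduced row-echelon form is

\[
\left[\begin{array}{rrrrr}
1 & 0 & 0 & 0 & \frac{13}{2} \\
0 & 1 & 0 & 2 & -\frac{5}{2} \\
0 & 0 & 1 & -1 & \frac{1}{2} \\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\]

and so the rank is 3. The row space is given by

\[
\mathrm{row}(A) = \mathrm{span}\left\{
\left[\begin{array}{rrrrr} 1 & 0 & 0 & 0 & \frac{13}{2} \end{array}\right],
\left[\begin{array}{rrrrr} 0 & 1 & 0 & 2 & -\frac{5}{2} \end{array}\right],
\left[\begin{array}{rrrrr} 0 & 0 & 1 & -1 & \frac{1}{2} \end{array}\right]
\right\}
\]

Notice that the first three columns of the reduced row-echelon form are pivot columns. The column space
is the span of the first three columns in the original matrix,

\[
\mathrm{col}(A) = \mathrm{span}\left\{
\left[\begin{array}{r} 1 \\ 1 \\ 1 \\ 1 \end{array}\right],
\left[\begin{array}{r} 2 \\ 3 \\ 2 \\ 3 \end{array}\right],
\left[\begin{array}{r} 1 \\ 6 \\ 1 \\ 2 \end{array}\right]
\right\}
\]

♠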
Consider  the  solution  given  above  for  Example  4.101,  where  the  rank  of  A  equals  3.  Notice  that  the 
row  space  and  the  column  space  each  had  dimension  equal  to  3.  It  turns  out  that  this  is  not  a  coincidence, 
and  this  essential  result  is  referred  to  as  the  Rank  Theorem  and  is  given  now.  Recall  that  we  defined 
rank(A)  =  dim(row(A)). 


Theorem  4.102:  Rank  Theorem 


Let  A  be  an  m  x  n  matrix.  Then  dim(col(A)) ,  the  dimension  of  the  column  space,  is  equal  to  the 
dimension  of  the  row  space,  dim(row(A)) . 


The  following  statements  all  follow  from  the  Rank  Theorem. 
Consider the following example.


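Solution. To find rank(A) we first row reduce to find the reduced row-echelon form.

\[ A = \left[ \begin{array}{rr} 1 & 2 \\ -1 & 1 \end{array} \right] \rightarrow \cdots \rightarrow \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right] \]

Therefore the rank of A is 2. Now consider $A^T$ given by

\[ A^T = \left[ \begin{array}{rr} 1 & -1 \\ 2 & 1 \end{array} \right] \]

Again we row reduce to find the reduced row-echelon form.

\[ \left[ \begin{array}{rr} 1 & -1 \\ 2 & 1 \end{array} \right] \rightarrow \cdots \rightarrow \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right] \]

You can see that $\mathrm{rank}(A^T) = 2$, the same as rank(A).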
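
As a hedged numerical check of the Rank Theorem, the sketch below compares the rank of a matrix with the rank of its transpose. The function names `rank` and `transpose` are assumptions made for this illustration; the matrix `B` is the one from Example 4.100.

```python
from fractions import Fraction

def rank(rows):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        # Find a usable pivot in column c at or below row r.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        # Eliminate the entries below the pivot.
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def transpose(rows):
    """Swap rows and columns."""
    return [list(col) for col in zip(*rows)]

A = [[1, 2], [-1, 1]]
B = [[1, 2, 1, 3, 2], [1, 3, 6, 0, 2], [3, 7, 8, 6, 6]]

for M in (A, B):
    # The two printed numbers agree: row rank equals column rank.
    print(rank(M), rank(transpose(M)))
```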

We  now  define  what  is  meant  by  the  null  space  of  a  general  m  x  n  matrix. 


* 


It can also be referred to using the notation ker(A). Similarly, we can discuss the image of A, denoted by im(A). The image of A consists of the vectors of $\mathbb{R}^m$ which “get hit” by A. The formal definition is as follows.

Consider A as a mapping from $\mathbb{R}^n$ to $\mathbb{R}^m$ whose action is given by multiplication. The following diagram displays this scenario.

\[ \mathrm{null}(A) \subseteq \mathbb{R}^n \xrightarrow{\ A\ } \mathrm{im}(A) \subseteq \mathbb{R}^m \]

As indicated, im(A) is a subset of $\mathbb{R}^m$ while null(A) is a subset of $\mathbb{R}^n$.

It  turns  out  that  the  null  space  and  image  of  A  are  both  subspaces.  Consider  the  following  example. 


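Solution.

• Since $A\vec{0}_n = \vec{0}_m$, $\vec{0}_n \in \mathrm{null}(A)$.

• Let $\vec{x}, \vec{y} \in \mathrm{null}(A)$. Then $A\vec{x} = \vec{0}_m$ and $A\vec{y} = \vec{0}_m$, so

\[ A(\vec{x} + \vec{y}) = A\vec{x} + A\vec{y} = \vec{0}_m + \vec{0}_m = \vec{0}_m, \]

and thus $\vec{x} + \vec{y} \in \mathrm{null}(A)$.

• Let $\vec{x} \in \mathrm{null}(A)$ and $k \in \mathbb{R}$. Then $A\vec{x} = \vec{0}_m$, so

\[ A(k\vec{x}) = k(A\vec{x}) = k\vec{0}_m = \vec{0}_m, \]

and thus $k\vec{x} \in \mathrm{null}(A)$.

Therefore by the subspace test, null(A) is a subspace of $\mathbb{R}^n$.

♠

The proof that im(A) is a subspace of $\mathbb{R}^m$ is similar and is left as an exercise to the reader.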

We now wish to find a way to describe null(A) for a matrix A. However, finding null(A) is not new! There is just some new terminology being used, as null(A) is simply the solution set of the system $A\vec{x} = \vec{0}$.


Theorem 4.108: Basis of null(A)


Let A be an m × n matrix such that rank(A) = r. Then the system $A\vec{x} = \vec{0}_m$ has n − r basic solutions, providing a basis of null(A) with dim(null(A)) = n − r.


Consider  the  following  example. 
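Solution. In order to find null(A), we simply need to solve the equation $A\vec{x} = \vec{0}$. This is the usual procedure of writing the augmented matrix, finding the reduced row-echelon form and then the solution. The augmented matrix and corresponding reduced row-echelon form are

\[ \left[ \begin{array}{rrr|r} 1 & 2 & 1 & 0 \\ 0 & -1 & 1 & 0 \\ 2 & 3 & 3 & 0 \end{array} \right] \rightarrow \cdots \rightarrow \left[ \begin{array}{rrr|r} 1 & 0 & 3 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array} \right] \]

The third column is not a pivot column, and therefore the solution will contain a parameter. The solution to the system $A\vec{x} = \vec{0}$ is given by

\[ \left\{ \begin{bmatrix} -3t \\ t \\ t \end{bmatrix} : t \in \mathbb{R} \right\} \]

which can be written as

\[ \left\{ t \begin{bmatrix} -3 \\ 1 \\ 1 \end{bmatrix} : t \in \mathbb{R} \right\} \]

Therefore, the null space of A is all multiples of this vector, which we can write as

\[ \mathrm{null}(A) = \mathrm{span} \left\{ \begin{bmatrix} -3 \\ 1 \\ 1 \end{bmatrix} \right\} \]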

Finally im(A) is just $\{ A\vec{x} : \vec{x} \in \mathbb{R}^n \}$ and hence consists of the span of all columns of A, that is im(A) = col(A).

Notice from the above calculation that the first two columns of the reduced row-echelon form are pivot columns. Thus the column space is the span of the first two columns in the original matrix, and we get

\[ \mathrm{im}(A) = \mathrm{col}(A) = \mathrm{span} \left\{ \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}, \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} \right\} \]

♠
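
The basic solutions promised by Theorem 4.108 can be read off mechanically from the reduced row-echelon form: set one free variable to 1, the other free variables to 0, and solve for the pivot variables. The sketch below does this for the 3 × 3 matrix whose augmented form appears in the solution above. The function names `rref` and `null_space_basis` are assumptions made for this illustration.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row-echelon form and the pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots

def null_space_basis(rows):
    """One basic solution of Ax = 0 for each non-pivot (free) column."""
    R, pivots = rref(rows)
    n = len(rows[0])
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)            # this free variable equals 1, the others 0
        for i, p in enumerate(pivots):
            v[p] = -R[i][f]           # pivot variable read from row i of R
        basis.append(v)
    return basis

A = [[1, 2, 1],
     [0, -1, 1],
     [2, 3, 3]]

R, pivots = rref(A)
basis = null_space_basis(A)
print([[str(x) for x in v] for v in basis])
```

The number of vectors returned is n − r, the number of non-pivot columns.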


Here  is  a  larger  example,  but  the  method  is  entirely  similar. 
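Solution. To find the null space, we need to solve the equation $A\vec{x} = \vec{0}$. The augmented matrix and corresponding reduced row-echelon form are given by

\[ \left[ \begin{array}{rrrrr|r} 1 & 2 & 1 & 0 & 1 & 0 \\ 2 & -1 & 1 & 3 & 0 & 0 \\ 3 & 1 & 2 & 3 & 1 & 0 \\ 4 & -2 & 2 & 6 & 0 & 0 \end{array} \right] \rightarrow \cdots \rightarrow \left[ \begin{array}{rrrrr|r} 1 & 0 & \frac{3}{5} & \frac{6}{5} & \frac{1}{5} & 0 \\ 0 & 1 & \frac{1}{5} & -\frac{3}{5} & \frac{2}{5} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array} \right] \]

It follows that the first two columns are pivot columns, and the next three correspond to parameters. Therefore, null(A) is given by

\[ \left\{ \begin{bmatrix} -\frac{3}{5}s - \frac{6}{5}t - \frac{1}{5}r \\ -\frac{1}{5}s + \frac{3}{5}t - \frac{2}{5}r \\ s \\ t \\ r \end{bmatrix} : s, t, r \in \mathbb{R} \right\} \]

We write this in the form

\[ \left\{ s \begin{bmatrix} -\frac{3}{5} \\ -\frac{1}{5} \\ 1 \\ 0 \\ 0 \end{bmatrix} + t \begin{bmatrix} -\frac{6}{5} \\ \frac{3}{5} \\ 0 \\ 1 \\ 0 \end{bmatrix} + r \begin{bmatrix} -\frac{1}{5} \\ -\frac{2}{5} \\ 0 \\ 0 \\ 1 \end{bmatrix} : s, t, r \in \mathbb{R} \right\} \]

In other words, the null space of this matrix equals the span of the three vectors above. Thus

\[ \mathrm{null}(A) = \mathrm{span} \left\{ \begin{bmatrix} -\frac{3}{5} \\ -\frac{1}{5} \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -\frac{6}{5} \\ \frac{3}{5} \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -\frac{1}{5} \\ -\frac{2}{5} \\ 0 \\ 0 \\ 1 \end{bmatrix} \right\} \]

♠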

Notice also that the three vectors above are linearly independent and so the dimension of null(A) is 3. The following is true in general: the number of parameters in the solution of $A\vec{x} = \vec{0}$ equals the dimension of the null space. Recall also that the number of leading ones in the reduced row-echelon form equals the number of pivot columns, which is the rank of the matrix, which is the same as the dimension of either the column or row space.

Before  we  proceed  to  an  important  theorem,  we  first  define  what  is  meant  by  the  nullity  of  a  matrix. 


Definition  4.111:  Nullity 


The  dimension  of  the  null  space  of  a  matrix  is  called  the  nullity,  denoted  dim  (null  (A)). 


From  our  observation  above  we  can  now  state  an  important  theorem. 


Consider the following example, which we first explored above in Example 4.109.


Solution. In the above Example 4.109 we determined that the reduced row-echelon form of A is given by

\[ \left[ \begin{array}{rrr} 1 & 0 & 3 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{array} \right] \]

Therefore the rank of A is 2. We also determined that the null space of A is given by

\[ \mathrm{null}(A) = \mathrm{span} \left\{ \begin{bmatrix} -3 \\ 1 \\ 1 \end{bmatrix} \right\} \]

Therefore the nullity of A is 1. It follows from Theorem 4.112 that rank(A) + dim(null(A)) = 2 + 1 = 3, which is the number of columns of A. ♠
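
The same count can be repeated, as a brief worked check, for the larger matrix with four rows and five columns treated earlier in this section. Its reduced row-echelon form had two pivot columns, and its null space had dimension 3, so the two numbers again add up to the number of columns:

```latex
\[
\mathrm{rank}(A) + \dim(\mathrm{null}(A)) = 2 + 3 = 5 = n .
\]
```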

We  conclude  this  section  with  two  similar,  and  important,  theorems. 
Exercises


Exercise  4.10.1  Here  are  some  vectors. 


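\[ \begin{bmatrix} 1 \\ 1 \\ -2 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ -2 \end{bmatrix}, \begin{bmatrix} 2 \\ 7 \\ -4 \end{bmatrix}, \begin{bmatrix} 5 \\ 7 \\ -10 \end{bmatrix}, \begin{bmatrix} 12 \\ 17 \\ -24 \end{bmatrix} \]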

Describe  the  span  of  these  vectors  as  the  span  of  as  few  vectors  as  possible. 

Exercise  4.10.2  Here  are  some  vectors. 

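\[ \begin{bmatrix} 1 \\ 2 \\ -2 \end{bmatrix}, \begin{bmatrix} 12 \\ 29 \\ -24 \end{bmatrix}, \begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix}, \begin{bmatrix} 2 \\ 9 \\ -4 \end{bmatrix}, \begin{bmatrix} 5 \\ 12 \\ -10 \end{bmatrix} \]

Describe the span of these vectors as the span of as few vectors as possible.

Exercise 4.10.3 Here are some vectors.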


Describe  the  span  of  these  vectors  as  the  span  of  as  few  vectors  as  possible. 
Exercise  4.10.4  Here  are  some  vectors. 


Now here is another vector:

\[ \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} \]

Is  this  vector  in  the  span  of  the  first  four  vectors?  If  it  is,  exhibit  a  linear  combination  of  the  first  four 
vectors  which  equals  this  vector,  using  as  few  vectors  as  possible  in  the  linear  combination. 

Exercise  4.10.5  Here  are  some  vectors. 


Now here is another vector:

\[ \begin{bmatrix} 2 \\ -3 \\ -4 \end{bmatrix} \]

Is  this  vector  in  the  span  of  the  first  four  vectors?  If  it  is,  exhibit  a  linear  combination  of  the  first  four 
vectors  which  equals  this  vector,  using  as  few  vectors  as  possible  in  the  linear  combination. 

Exercise  4.10.6  Here  are  some  vectors. 


Now here is another vector:

\[ \begin{bmatrix} 1 \\ 9 \\ 1 \end{bmatrix} \]

Is  this  vector  in  the  span  of  the  first  four  vectors?  If  it  is,  exhibit  a  linear  combination  of  the  first  four 
vectors  which  equals  this  vector,  using  as  few  vectors  as  possible  in  the  linear  combination. 
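Exercise 4.10.7 Here are some vectors.

\[ \begin{bmatrix} 1 \\ -1 \\ -2 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix}, \begin{bmatrix} 1 \\ -5 \\ -2 \end{bmatrix}, \begin{bmatrix} -1 \\ 5 \\ 2 \end{bmatrix} \]

Now here is another vector:

\[ \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} \]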

Is  this  vector  in  the  span  of  the  first  four  vectors?  If  it  is,  exhibit  a  linear  combination  of  the  first  four 
vectors  which  equals  this  vector,  using  as  few  vectors  as  possible  in  the  linear  combination. 

Exercise  4.10.8  Here  are  some  vectors. 

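\[ \begin{bmatrix} 1 \\ -1 \\ -2 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix}, \begin{bmatrix} 1 \\ -5 \\ -2 \end{bmatrix}, \begin{bmatrix} -1 \\ 5 \\ 2 \end{bmatrix} \]

Now here is another vector:

\[ \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} \]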

Is  this  vector  in  the  span  of  the  first  four  vectors?  If  it  is,  exhibit  a  linear  combination  of  the  first  four 
vectors  which  equals  this  vector,  using  as  few  vectors  as  possible  in  the  linear  combination. 

Exercise  4.10.9  Here  are  some  vectors. 

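\[ \begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ -2 \end{bmatrix}, \begin{bmatrix} 2 \\ -2 \\ -3 \end{bmatrix}, \begin{bmatrix} -1 \\ 4 \\ 2 \end{bmatrix} \]

Now here is another vector:

\[ \begin{bmatrix} -1 \\ -4 \\ 2 \end{bmatrix} \]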

Is  this  vector  in  the  span  of  the  first  four  vectors?  If  it  is,  exhibit  a  linear  combination  of  the  first  four 
vectors  which  equals  this  vector,  using  as  few  vectors  as  possible  in  the  linear  combination. 

Exercise 4.10.10 Suppose $\{\vec{x}_1, \cdots, \vec{x}_k\}$ is a set of vectors from $\mathbb{R}^n$. Show that $\vec{0}$ is in $\mathrm{span}\{\vec{x}_1, \cdots, \vec{x}_k\}$.

Exercise  4.10.11  Are  the  following  vectors  linearly  independent?  If  they  are,  explain  why  and  if  they  are 
not,  exhibit  one  of  them  as  a  linear  combination  of  the  others.  Also  give  a  linearly  independent  set  of 
vectors  which  has  the  same  span  as  the  given  vectors. 

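\[ \begin{bmatrix} 1 \\ 3 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 4 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 4 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 10 \\ 2 \\ 1 \end{bmatrix} \]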
Exercise 4.10.12 Are the following vectors linearly independent? If they are, explain why and if they are not, exhibit one of them as a linear combination of the others. Also give a linearly independent set of vectors which has the same span as the given vectors.

\[ \begin{bmatrix} -1 \\ -2 \\ 2 \\ 3 \end{bmatrix}, \begin{bmatrix} -3 \\ -4 \\ 3 \\ 3 \end{bmatrix}, \begin{bmatrix} 0 \\ -1 \\ 4 \\ 3 \end{bmatrix}, \begin{bmatrix} 0 \\ -1 \\ 6 \\ 4 \end{bmatrix} \]

Exercise  4.10.13  Are  the  following  vectors  linearly  independent?  If  they  are,  explain  why  and  if  they  are 
not,  exhibit  one  of  them  as  a  linear  combination  of  the  others.  Also  give  a  linearly  independent  set  of 
vectors  which  has  the  same  span  as  the  given  vectors. 


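\[ \begin{bmatrix} 1 \\ 5 \\ -2 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 6 \\ -3 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ -4 \\ 1 \\ -1 \end{bmatrix}, \begin{bmatrix} 1 \\ 6 \\ -2 \\ 1 \end{bmatrix} \]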

Exercise  4.10.14  Are  the  following  vectors  linearly  independent?  If  they  are,  explain  why  and  if  they  are 
not,  exhibit  one  of  them  as  a  linear  combination  of  the  others.  Also  give  a  linearly  independent  set  of 
vectors  which  has  the  same  span  as  the  given  vectors. 


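\[ \begin{bmatrix} 1 \\ -1 \\ 3 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 6 \\ 34 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 7 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 8 \\ 1 \end{bmatrix} \]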

Exercise  4.10.15  Are  the  following  vectors  linearly  independent?  If  they  are,  explain  why  and  if  they  are 
not,  exhibit  one  of  them  as  a  linear  combination  of  the  others. 


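\[ \begin{bmatrix} 1 \\ 3 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 4 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} -3 \\ -10 \\ 3 \\ -3 \end{bmatrix}, \begin{bmatrix} 1 \\ 4 \\ 0 \\ 1 \end{bmatrix} \]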

Exercise 4.10.16 Are the following vectors linearly independent? If they are, explain why and if they are not, exhibit one of them as a linear combination of the others. Also give a linearly independent set of vectors which has the same span as the given vectors.

\[ \begin{bmatrix} 1 \\ 3 \\ -3 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 4 \\ -5 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 4 \\ -4 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 10 \\ -14 \\ 1 \end{bmatrix} \]
Exercise 4.10.17 Are the following vectors linearly independent? If they are, explain why and if they are not, exhibit one of them as a linear combination of the others. Also give a linearly independent set of vectors which has the same span as the given vectors.

\[ \begin{bmatrix} 1 \\ 0 \\ 3 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 8 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 7 \\ 34 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 7 \\ 1 \end{bmatrix} \]

Exercise  4.10.18  Are  the  following  vectors  linearly  independent?  If  they  are,  explain  why  and  if  they  are 
not,  exhibit  one  of  them  as  a  linear  combination  of  the  others.  Also  give  a  linearly  independent  set  of 
vectors  which  has  the  same  span  as  the  given  vectors. 


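\[ \begin{bmatrix} 1 \\ 4 \\ -2 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 5 \\ -3 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 7 \\ -5 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 5 \\ -2 \\ 1 \end{bmatrix} \]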

Exercise  4.10.19  Are  the  following  vectors  linearly  independent?  If  they  are,  explain  why  and  if  they  are 
not,  exhibit  one  of  them  as  a  linear  combination  of  the  others. 


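\[ \begin{bmatrix} 1 \\ 2 \\ 2 \\ -4 \end{bmatrix}, \begin{bmatrix} 3 \\ 4 \\ 1 \\ -4 \end{bmatrix}, \begin{bmatrix} 0 \\ -1 \\ 0 \\ 4 \end{bmatrix}, \begin{bmatrix} 0 \\ -1 \\ -2 \\ 5 \end{bmatrix} \]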

Exercise  4.10.20  Are  the  following  vectors  linearly  independent?  If  they  are,  explain  why  and  if  they  are 
not,  exhibit  one  of  them  as  a  linear  combination  of  the  others.  Also  give  a  linearly  independent  set  of 
vectors  which  has  the  same  span  as  the  given  vectors. 


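\[ \begin{bmatrix} 2 \\ 3 \\ 1 \\ -3 \end{bmatrix}, \begin{bmatrix} -5 \\ -6 \\ 0 \\ 3 \end{bmatrix}, \begin{bmatrix} -1 \\ -2 \\ 1 \\ 3 \end{bmatrix}, \begin{bmatrix} -1 \\ -2 \\ 0 \\ 4 \end{bmatrix} \]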

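Exercise 4.10.21 Here are some vectors in $\mathbb{R}^4$.

\[ \begin{bmatrix} 1 \\ 1 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -2 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ -1 \\ 1 \end{bmatrix} \]

These vectors can't possibly be linearly independent. Tell why. Next obtain a linearly independent subset of these vectors which has the same span as these vectors. In other words, find a basis for the span of these vectors.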
Exercise 4.10.22 Here are some vectors in $\mathbb{R}^4$.

\[ \begin{bmatrix} 1 \\ 2 \\ -2 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 3 \\ -3 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 3 \\ -2 \\ 1 \end{bmatrix}, \begin{bmatrix} 4 \\ 3 \\ -1 \\ 4 \end{bmatrix}, \begin{bmatrix} 1 \\ 3 \\ -2 \\ 1 \end{bmatrix} \]

These vectors can't possibly be linearly independent. Tell why. Next obtain a linearly independent subset of these vectors which has the same span as these vectors. In other words, find a basis for the span of these vectors.

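Exercise 4.10.23 Here are some vectors in $\mathbb{R}^4$.

\[ \begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -2 \\ -3 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ -5 \\ -7 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 2 \\ 1 \end{bmatrix} \]

These vectors can't possibly be linearly independent. Tell why. Next obtain a linearly independent subset of these vectors which has the same span as these vectors. In other words, find a basis for the span of these vectors.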

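Exercise 4.10.24 Here are some vectors in $\mathbb{R}^4$.

\[ \begin{bmatrix} 1 \\ 2 \\ -2 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 3 \\ -3 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ -3 \\ 3 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 3 \\ -2 \\ 1 \end{bmatrix} \]

These vectors can't possibly be linearly independent. Tell why. Next obtain a linearly independent subset of these vectors which has the same span as these vectors. In other words, find a basis for the span of these vectors.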

Exercise 4.10.25 Here are some vectors in $\mathbb{R}^4$.

\[ \begin{bmatrix} 1 \\ 4 \\ -2 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 5 \\ -3 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 5 \\ -2 \\ 1 \end{bmatrix}, \begin{bmatrix} 4 \\ 11 \\ -1 \\ 4 \end{bmatrix}, \begin{bmatrix} 1 \\ 5 \\ -2 \\ 1 \end{bmatrix} \]

These vectors can't possibly be linearly independent. Tell why. Next obtain a linearly independent subset of these vectors which has the same span as these vectors. In other words, find a basis for the span of these vectors.

Exercise 4.10.26 Here are some vectors in $\mathbb{R}^4$.

These vectors can't possibly be linearly independent. Tell why. Next obtain a linearly independent subset of these vectors which has the same span as these vectors. In other words, find a basis for the span of these vectors.

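Exercise 4.10.27 Here are some vectors in $\mathbb{R}^4$.

\[ \begin{bmatrix} 1 \\ 3 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 4 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ -1 \\ -2 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 4 \\ 0 \\ 1 \end{bmatrix} \]

These vectors can't possibly be linearly independent. Tell why. Next obtain a linearly independent subset of these vectors which has the same span as these vectors. In other words, find a basis for the span of these vectors.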

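Exercise 4.10.28 Here are some vectors in $\mathbb{R}^4$.

\[ \begin{bmatrix} 1 \\ 4 \\ -2 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 5 \\ -3 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 3 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 5 \\ -2 \\ 1 \end{bmatrix} \]

These vectors can't possibly be linearly independent. Tell why. Next obtain a linearly independent subset of these vectors which has the same span as these vectors. In other words, find a basis for the span of these vectors.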

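Exercise 4.10.29 Here are some vectors in $\mathbb{R}^4$.

\[ \begin{bmatrix} 1 \\ -1 \\ 3 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 7 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 8 \\ 1 \end{bmatrix}, \begin{bmatrix} 4 \\ -9 \\ -6 \\ 4 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 8 \\ 1 \end{bmatrix} \]

These vectors can't possibly be linearly independent. Tell why. Next obtain a linearly independent subset of these vectors which has the same span as these vectors. In other words, find a basis for the span of these vectors.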

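Exercise 4.10.30 Here are some vectors in $\mathbb{R}^4$.

\[ \begin{bmatrix} 1 \\ -1 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} -3 \\ 3 \\ 3 \\ -3 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ -9 \\ -2 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \]

These vectors can't possibly be linearly independent. Tell why. Next obtain a linearly independent subset of these vectors which has the same span as these vectors. In other words, find a basis for the span of these vectors.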
Exercise 4.10.31 Here are some vectors in $\mathbb{R}^4$.

\[ \begin{bmatrix} 1 \\ b+1 \\ a \\ 1 \end{bmatrix}, \begin{bmatrix} 3 \\ 3b+3 \\ 3a \\ 3 \end{bmatrix}, \begin{bmatrix} 1 \\ b+2 \\ 2a+1 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 2b-5 \\ -5a-7 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ b+2 \\ 2a+2 \\ 1 \end{bmatrix} \]

These vectors can't possibly be linearly independent. Tell why. Next obtain a linearly independent subset of these vectors which has the same span as these vectors. In other words, find a basis for the span of these vectors.


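Exercise 4.10.32 Let $H = \mathrm{span} \left\{ \begin{bmatrix} 2 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ -1 \\ -1 \end{bmatrix}, \begin{bmatrix} 5 \\ 2 \\ 3 \\ 3 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \\ -2 \\ -2 \end{bmatrix} \right\}$. Find the dimension of H and determine a basis.

Exercise 4.10.33 Let H denote $\mathrm{span} \left\{ \begin{bmatrix} 0 \\ 1 \\ 1 \\ -1 \end{bmatrix}, \begin{bmatrix} -1 \\ -1 \\ -2 \\ 2 \end{bmatrix}, \begin{bmatrix} 2 \\ 3 \\ 5 \\ -5 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 2 \\ -2 \end{bmatrix} \right\}$. Find the dimension of H and determine a basis.

Exercise 4.10.34 Let H denote $\mathrm{span} \left\{ \begin{bmatrix} -2 \\ 1 \\ 1 \\ -3 \end{bmatrix}, \begin{bmatrix} -9 \\ 4 \\ 3 \\ -9 \end{bmatrix}, \begin{bmatrix} -33 \\ 15 \\ 12 \\ -36 \end{bmatrix}, \begin{bmatrix} -22 \\ 10 \\ 8 \\ -24 \end{bmatrix} \right\}$. Find the dimension of H and determine a basis.

Exercise 4.10.35 Let H denote $\mathrm{span} \left\{ \begin{bmatrix} -1 \\ 1 \\ -1 \\ -2 \end{bmatrix}, \begin{bmatrix} -4 \\ 3 \\ -2 \\ -4 \end{bmatrix}, \begin{bmatrix} -3 \\ 2 \\ -1 \\ -2 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \\ -2 \\ -4 \end{bmatrix}, \begin{bmatrix} -7 \\ 5 \\ -3 \\ -6 \end{bmatrix} \right\}$. Find the dimension of H and determine a basis.

Exercise 4.10.36 Let H denote $\mathrm{span} \left\{ \begin{bmatrix} 2 \\ 3 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 8 \\ 15 \\ 6 \\ 3 \end{bmatrix}, \begin{bmatrix} 3 \\ 6 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 4 \\ 6 \\ 6 \\ 3 \end{bmatrix}, \begin{bmatrix} 8 \\ 15 \\ 6 \\ 3 \end{bmatrix} \right\}$. Find the dimension of H and determine a basis.

Exercise 4.10.37 Let H denote $\mathrm{span} \left\{ \begin{bmatrix} 0 \\ 2 \\ 0 \\ -1 \end{bmatrix}, \begin{bmatrix} -1 \\ 6 \\ 0 \\ -2 \end{bmatrix}, \begin{bmatrix} -2 \\ 16 \\ 0 \\ -6 \end{bmatrix}, \begin{bmatrix} -3 \\ 22 \\ 0 \\ -8 \end{bmatrix} \right\}$. Find the dimension of H and determine a basis.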
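Exercise 4.10.38 Let H denote $\mathrm{span} \left\{ \begin{bmatrix} 5 \\ 1 \\ 1 \\ 4 \end{bmatrix}, \begin{bmatrix} 14 \\ 3 \\ 2 \\ 8 \end{bmatrix}, \begin{bmatrix} 38 \\ 8 \\ 6 \\ 24 \end{bmatrix}, \begin{bmatrix} 47 \\ 10 \\ 7 \\ 28 \end{bmatrix}, \begin{bmatrix} 10 \\ 2 \\ 3 \\ 12 \end{bmatrix} \right\}$. Find the dimension of H and determine a basis.

Exercise 4.10.39 Let H denote $\mathrm{span} \left\{ \begin{bmatrix} 6 \\ 1 \\ 1 \\ 5 \end{bmatrix}, \begin{bmatrix} 17 \\ 3 \\ 2 \\ 10 \end{bmatrix}, \begin{bmatrix} 52 \\ 9 \\ 7 \\ 35 \end{bmatrix}, \begin{bmatrix} 18 \\ 3 \\ 4 \\ 20 \end{bmatrix} \right\}$. Find the dimension of H and determine a basis.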


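Exercise 4.10.40 Let $M = \left\{ \vec{u} = \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix}^T \in \mathbb{R}^4 : \sin(u_1) = 1 \right\}$. Is M a subspace? Explain.

Exercise 4.10.41 Let $M = \left\{ \vec{u} = \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix}^T \in \mathbb{R}^4 : \| u_1 \| < 4 \right\}$. Is M a subspace? Explain.

Exercise 4.10.42 Let $M = \left\{ \vec{u} = \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix}^T \in \mathbb{R}^4 : u_i > 0 \text{ for each } i = 1, 2, 3, 4 \right\}$. Is M a subspace? Explain.

Exercise 4.10.43 Let $\vec{w}, \vec{w}_1$ be given vectors in $\mathbb{R}^4$ and define

\[ M = \left\{ \vec{u} = \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix}^T \in \mathbb{R}^4 : \vec{w} \cdot \vec{u} = 0 \text{ and } \vec{w}_1 \cdot \vec{u} = 0 \right\}. \]

Is M a subspace? Explain.

Exercise 4.10.44 Let $\vec{w} \in \mathbb{R}^4$ and let $M = \left\{ \vec{u} = \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix}^T \in \mathbb{R}^4 : \vec{w} \cdot \vec{u} = 0 \right\}$. Is M a subspace? Explain.

Exercise 4.10.45 Let $M = \left\{ \vec{u} = \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix}^T \in \mathbb{R}^4 : u_3 > u_1 \right\}$. Is M a subspace? Explain.

Exercise 4.10.46 Let $M = \left\{ \vec{u} = \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix}^T \in \mathbb{R}^4 : u_3 = u_1 \right\}$. Is M a subspace? Explain.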


Exercise 4.10.47 Consider the set of vectors S given by

\[ S = \left\{ \begin{bmatrix} 4u + v - 5w \\ 12u + 6v - 6w \\ 4u + 4v + 4w \end{bmatrix} : u, v, w \in \mathbb{R} \right\} \]

Is S a subspace of $\mathbb{R}^3$? If so, explain why, give a basis for the subspace and find its dimension.


Exercise 4.10.48 Consider the set of vectors S given by

\[ S = \left\{ \begin{bmatrix} 2u + 6v + 7w \\ -3u - 9v - 12w \\ 2u + 6v + 6w \\ u + 3v + 3w \end{bmatrix} : u, v, w \in \mathbb{R} \right\} \]

Is S a subspace of $\mathbb{R}^4$? If so, explain why, give a basis for the subspace and find its dimension.


Exercise 4.10.49 Consider the set of vectors S given by

\[ S = \left\{ \begin{bmatrix} 2u + v \\ 6v - 3u + 3w \\ 3v - 6u + 3w \end{bmatrix} : u, v, w \in \mathbb{R} \right\} \]

Is this set of vectors a subspace of $\mathbb{R}^3$? If so, explain why, give a basis for the subspace and find its dimension.


Exercise 4.10.50 Consider the vectors of the form

\[ \left\{ \begin{bmatrix} 2u + v + 7w \\ u - 2v + w \\ -6v - 6w \end{bmatrix} : u, v, w \in \mathbb{R} \right\} \]

Is this set of vectors a subspace of $\mathbb{R}^3$? If so, explain why, give a basis for the subspace and find its dimension.

Exercise 4.10.51 Consider the vectors of the form

\[ \left\{ \begin{bmatrix} 3u + v + 11w \\ 18u + 6v + 66w \\ 28u + 8v + 100w \end{bmatrix} : u, v, w \in \mathbb{R} \right\} \]

Is this set of vectors a subspace of $\mathbb{R}^3$? If so, explain why, give a basis for the subspace and find its dimension.
Exercise 4.10.52 Consider the vectors of the form

\[ \left\{ \begin{bmatrix} 3u + v \\ 2w - 4u \\ 2w - 2v - 8u \end{bmatrix} : u, v, w \in \mathbb{R} \right\} \]

Is this set of vectors a subspace of $\mathbb{R}^3$? If so, explain why, give a basis for the subspace and find its dimension.

Exercise 4.10.53 Consider the set of vectors S given by

\[ \left\{ \begin{bmatrix} u + v + w \\ 2u + 2v + 4w \\ u + v + w \\ 0 \end{bmatrix} : u, v, w \in \mathbb{R} \right\} \]

Is S a subspace of $\mathbb{R}^4$? If so, explain why, give a basis for the subspace and find its dimension.

Exercise 4.10.54 Consider the set of vectors S given by

\[ \left\{ \begin{bmatrix} v \\ -3u - 3w \\ 8u - 4v + 4w \end{bmatrix} : u, v, w \in \mathbb{R} \right\} \]

Is S a subspace of $\mathbb{R}^3$? If so, explain why, give a basis for the subspace and find its dimension.

Exercise 4.10.55 If you have 5 vectors in $\mathbb{R}^5$ and the vectors are linearly independent, can it always be concluded they span $\mathbb{R}^5$? Explain.

Exercise 4.10.56 If you have 6 vectors in $\mathbb{R}^5$, is it possible they are linearly independent? Explain.

Exercise 4.10.57 Suppose A is an m × n matrix and $\{\vec{w}_1, \cdots, \vec{w}_k\}$ is a linearly independent set of vectors in $A(\mathbb{R}^n) \subseteq \mathbb{R}^m$. Now suppose $A\vec{z}_i = \vec{w}_i$. Show $\{\vec{z}_1, \cdots, \vec{z}_k\}$ is also independent.

Exercise 4.10.58 Suppose V, W are subspaces of $\mathbb{R}^n$. Let $V \cap W$ be all vectors which are in both V and W. Show that $V \cap W$ is a subspace also.

Exercise 4.10.59 Suppose V and W both have dimension equal to 7 and they are subspaces of $\mathbb{R}^{10}$. What are the possibilities for the dimension of $V \cap W$? Hint: Remember that a linearly independent set can be extended to form a basis.

Exercise 4.10.60 Suppose V has dimension p and W has dimension q and they are each contained in a subspace, U, which has dimension equal to n where $n > \max(p, q)$. What are the possibilities for the dimension of $V \cap W$? Hint: Remember that a linearly independent set can be extended to form a basis.

Exercise 4.10.61 Suppose A is an m × n matrix and B is an n × p matrix. Show that

\[ \dim(\ker(AB)) \leq \dim(\ker(A)) + \dim(\ker(B)). \]

Hint: Consider the subspace $B(\mathbb{R}^p) \cap \ker(A)$ and suppose a basis for this subspace is $\{\vec{w}_1, \cdots, \vec{w}_k\}$. Now suppose $\{\vec{u}_1, \cdots, \vec{u}_r\}$ is a basis for ker(B). Let $\{\vec{z}_1, \cdots, \vec{z}_k\}$ be such that $B\vec{z}_i = \vec{w}_i$ and argue that

\[ \ker(AB) \subseteq \mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_r, \vec{z}_1, \cdots, \vec{z}_k\}. \]

Exercise 4.10.62 Show that if A is an m × n matrix, then ker(A) is a subspace of $\mathbb{R}^n$.


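Exercise 4.10.63 Find the rank of the following matrix. Also find a basis for the row and column spaces.

\[ \left[ \begin{array}{rrrrrr} 1 & 3 & 0 & -2 & 0 & 3 \\ 3 & 9 & 1 & -7 & 0 & 8 \\ 1 & 3 & 1 & -3 & 1 & -1 \\ 1 & 3 & -1 & -1 & -2 & 10 \end{array} \right] \]

Exercise 4.10.64 Find the rank of the following matrix. Also find a basis for the row and column spaces.

\[ \left[ \begin{array}{rrrrrr} 1 & 3 & 0 & -2 & 7 & 3 \\ 3 & 9 & 1 & -7 & 23 & 8 \\ 1 & 3 & 1 & -3 & 9 & 2 \\ 1 & 3 & -1 & -1 & 5 & 4 \end{array} \right] \]

Exercise 4.10.65 Find the rank of the following matrix. Also find a basis for the row and column spaces.

\[ \left[ \begin{array}{rrrrrr} 1 & 0 & 3 & 0 & 7 & 0 \\ 3 & 1 & 10 & 0 & 23 & 0 \\ 1 & 1 & 4 & 1 & 7 & 0 \\ 1 & -1 & 2 & -2 & 9 & 1 \end{array} \right] \]

Exercise 4.10.66 Find the rank of the following matrix. Also find a basis for the row and column spaces.

\[ \left[ \begin{array}{rrr} 1 & 0 & 3 \\ 3 & 1 & 10 \\ 1 & 1 & 4 \\ 1 & -1 & 2 \end{array} \right] \]

Exercise 4.10.67 Find the rank of the following matrix. Also find a basis for the row and column spaces.

\[ \left[ \begin{array}{rrrrr} 0 & 0 & -1 & 0 & 1 \\ 1 & 2 & 3 & -2 & -18 \\ 1 & 2 & 2 & -1 & -11 \\ -1 & -2 & -2 & 1 & 11 \end{array} \right] \]

Exercise 4.10.68 Find the rank of the following matrix. Also find a basis for the row and column spaces.

\[ \left[ \begin{array}{rrrr} 1 & 0 & 3 & 0 \\ 3 & 1 & 10 & 0 \\ -1 & 1 & -2 & 1 \\ 1 & -1 & 2 & -2 \end{array} \right] \]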


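Exercise 4.10.69 Find ker(A) for the following matrices.

(a) $A = \left[ \begin{array}{rr} 2 & 3 \\ 4 & 6 \end{array} \right]$

(b) $A = \left[ \begin{array}{rrr} 1 & 0 & -1 \\ -1 & 1 & 3 \\ 3 & 2 & 1 \end{array} \right]$

(c) $A = \left[ \begin{array}{rrr} 2 & 4 & 0 \\ 3 & 6 & -2 \\ 1 & 2 & -2 \end{array} \right]$

(d) $A = \left[ \begin{array}{rrrr} 2 & -1 & 3 & 5 \\ 2 & 0 & 1 & 2 \\ 6 & 4 & -5 & -6 \\ 0 & 2 & -4 & -6 \end{array} \right]$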


4.11  Orthogonality  and  the  Gram  Schmidt  Process 


Outcomes 


A.  Determine  if  a  given  set  is  orthogonal  or  orthonormal. 

B.  Determine  if  a  given  matrix  is  orthogonal. 

C. Given a linearly independent set, use the Gram-Schmidt Process to find corresponding orthogonal and orthonormal sets.

D.  Find  the  orthogonal  projection  of  a  vector  onto  a  subspace. 

E.  Find  the  least  squares  approximation  for  a  collection  of  points. 
4.11.1. Orthogonal and Orthonormal Sets


In this section, we examine what it means for vectors (and sets of vectors) to be orthogonal and orthonormal. First, it is necessary to review some important concepts. You may recall the definitions for the span of a set of vectors and a linearly independent set of vectors. We include the definitions and examples here for convenience.


Definition 4.116: Span of a Set of Vectors and Subspace


The collection of all linear combinations of a set of vectors $\{\vec{u}_1, \cdots, \vec{u}_k\}$ in $\mathbb{R}^n$ is known as the span of these vectors and is written as $\mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_k\}$.

We call a collection of the form $\mathrm{span}\{\vec{u}_1, \cdots, \vec{u}_k\}$ a subspace of $\mathbb{R}^n$.


Consider  the  following  example. 


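Example 4.117: Span of Vectors

Describe the span of the vectors $\vec{u} = \begin{bmatrix} 1 & 1 & 0 \end{bmatrix}^T$ and $\vec{v} = \begin{bmatrix} 3 & 2 & 0 \end{bmatrix}^T \in \mathbb{R}^3$.

Solution. You can see that any linear combination of the vectors $\vec{u}$ and $\vec{v}$ yields a vector $\begin{bmatrix} x & y & 0 \end{bmatrix}^T$ in the XY-plane.

Moreover every vector in the XY-plane is in fact such a linear combination of the vectors $\vec{u}$ and $\vec{v}$. That's because

\[ \begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = (-2x + 3y) \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + (x - y) \begin{bmatrix} 3 \\ 2 \\ 0 \end{bmatrix} \]

Thus $\mathrm{span}\{\vec{u}, \vec{v}\}$ is precisely the XY-plane.

♠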


The span of a set of vectors in $\mathbb{R}^n$ is what we call a subspace of $\mathbb{R}^n$. A subspace W is characterized by the feature that any linear combination of vectors of W is again a vector contained in W.

Another  important  property  of  sets  of  vectors  is  called  linear  independence. 


Definition 4.118: Linearly Independent Set of Vectors


A set of non-zero vectors $\{\vec{u}_1, \cdots, \vec{u}_k\}$ in $\mathbb{R}^n$ is said to be linearly independent if no vector in that set is in the span of the other vectors of that set.


Here  is  an  example. 
Solution. We already verified in Example 4.117 that $\mathrm{span}\{\vec{u}, \vec{v}\}$ is the XY-plane. Since $\vec{w}$ is clearly also in the XY-plane, the set $\{\vec{u}, \vec{v}, \vec{w}\}$ is not linearly independent. ♠

In terms of spanning, a set of vectors is linearly independent if it does not contain unnecessary vectors. In the previous example you can see that the vector $\vec{w}$ does not help to span any new vector not already in the span of the other two vectors. However you can verify that the set $\{\vec{u}, \vec{v}\}$ is linearly independent, since you will not get the XY-plane as the span of a single vector.

We  can  also  determine  if  a  set  of  vectors  is  linearly  independent  by  examining  linear  combinations.  A 
set  of  vectors  is  linearly  independent  if  and  only  if  whenever  a  linear  combination  of  these  vectors  equals 
zero,  it  follows  that  all  the  coefficients  equal  zero.  It  is  a  good  exercise  to  verify  this  equivalence,  and  this 
latter  condition  is  often  used  as  the  (equivalent)  definition  of  linear  independence. 
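
This criterion translates directly into a computation: a linear combination of the vectors equals zero only for zero coefficients exactly when the matrix built from the vectors has rank equal to the number of vectors. The sketch below is an illustration under stated assumptions: the helper names `rank` and `is_independent` are introduced here, `u` and `v` are the vectors of Example 4.117, and `w` is a hypothetical third vector chosen in the XY-plane.

```python
from fractions import Fraction

def rank(rows):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def is_independent(vectors):
    """Vectors (given as rows) are independent iff the rank equals their number."""
    return rank(vectors) == len(vectors)

u = [1, 1, 0]
v = [3, 2, 0]
w = [4, 5, 0]  # hypothetical vector in the XY-plane, chosen for illustration

print(is_independent([u, v]))      # u and v alone
print(is_independent([u, v, w]))   # adding a third vector from the same plane
```

Placing the vectors as rows rather than columns does not change the answer, since the row rank and column rank of a matrix agree.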

If a subspace is spanned by a linearly independent set of vectors, then we say that this set is a basis for the subspace.


Thus the set of vectors $\{\vec{u}, \vec{v}\}$ from Example 4.119 is a basis for the XY-plane in $\mathbb{R}^3$ since it is both linearly independent and spans the XY-plane.

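Recall from the properties of the dot product of vectors that two vectors $\vec{u}$ and $\vec{v}$ are orthogonal if $\vec{u} \cdot \vec{v} = 0$. Suppose a vector is orthogonal to a spanning set of $\mathbb{R}^n$. What can be said about such a vector? This is the discussion in the following example.


Example 4.121: Orthogonal Vector to a Spanning Set


Let $\{\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_k\} \subseteq \mathbb{R}^n$ and suppose $\mathbb{R}^n = \mathrm{span}\{\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_k\}$. Furthermore, suppose that there exists a vector $\vec{u} \in \mathbb{R}^n$ for which $\vec{u} \cdot \vec{x}_j = 0$ for all j, $1 \leq j \leq k$. What type of vector is $\vec{u}$?


Solution. Write $\vec{u} = t_1\vec{x}_1 + t_2\vec{x}_2 + \cdots + t_k\vec{x}_k$ for some $t_1, t_2, \ldots, t_k \in \mathbb{R}$ (this is possible because $\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_k$ span $\mathbb{R}^n$).

Then

\[ \begin{aligned} \|\vec{u}\|^2 &= \vec{u} \cdot \vec{u} \\ &= \vec{u} \cdot (t_1\vec{x}_1 + t_2\vec{x}_2 + \cdots + t_k\vec{x}_k) \\ &= \vec{u} \cdot (t_1\vec{x}_1) + \vec{u} \cdot (t_2\vec{x}_2) + \cdots + \vec{u} \cdot (t_k\vec{x}_k) \\ &= t_1(\vec{u} \cdot \vec{x}_1) + t_2(\vec{u} \cdot \vec{x}_2) + \cdots + t_k(\vec{u} \cdot \vec{x}_k) \\ &= t_1(0) + t_2(0) + \cdots + t_k(0) = 0. \end{aligned} \]

Since $\|\vec{u}\|^2 = 0$, $\|\vec{u}\| = 0$. We know that $\|\vec{u}\| = 0$ if and only if $\vec{u} = \vec{0}_n$. Therefore, $\vec{u} = \vec{0}_n$. In conclusion, the only vector orthogonal to every vector of a spanning set of $\mathbb{R}^n$ is the zero vector. ♠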
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We  can  now  discuss  what  is  meant  by  an  orthogonal  set  of  vectors. 


If we have an orthogonal set of vectors and normalize each vector so that it has length 1, the resulting
set is called an orthonormal set of vectors. They can be described as follows.


Note that all orthonormal sets are orthogonal, but the reverse is not necessarily true since the vectors
may not be normalized. In order to normalize the vectors, we simply need to divide each one by its length.


Definition 4.124: Normalizing an Orthogonal Set

Normalizing an orthogonal set is the process of turning an orthogonal (but not orthonormal) set into
an orthonormal set. If $\{u_1, u_2, \ldots, u_k\}$ is an orthogonal subset of $\mathbb{R}^n$, then

\[
\left\{ \frac{1}{\|u_1\|} u_1, \; \frac{1}{\|u_2\|} u_2, \; \ldots, \; \frac{1}{\|u_k\|} u_k \right\}
\]

is an orthonormal set.

We  illustrate  this  concept  in  the  following  example. 


Example  4.125:  Orthonormal  Set 


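Consider the set of vectors given by

\[
\{u_1, u_2\} = \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \end{bmatrix} \right\}
\]

Show that it is an orthogonal set of vectors but not an orthonormal one. Find the corresponding
orthonormal set.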


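Solution. One easily verifies that $u_1 \cdot u_2 = 0$ and $\{u_1, u_2\}$ is an orthogonal set of vectors. On the other
hand one can compute that $\|u_1\| = \|u_2\| = \sqrt{2} \neq 1$ and thus it is not an orthonormal set.

Thus to find a corresponding orthonormal set, we simply need to normalize each vector. We will write
$\{w_1, w_2\}$ for the corresponding orthonormal set. Then,

\[
w_1 = \frac{1}{\|u_1\|} u_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix}
\]

Similarly,

\[
w_2 = \frac{1}{\|u_2\|} u_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 1 \end{bmatrix}
= \begin{bmatrix} -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix}
\]

Therefore the corresponding orthonormal set is

\[
\{w_1, w_2\} = \left\{ \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix},
\begin{bmatrix} -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix} \right\}
\]

You can verify that this set is orthogonal. ♠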

Consider an orthogonal set of vectors in $\mathbb{R}^n$, written $\{w_1, \ldots, w_k\}$ with $k \le n$. The span of these vectors
is a subspace $W$ of $\mathbb{R}^n$. If we could show that this orthogonal set is also linearly independent, we would
have a basis of $W$. We will show this in the next theorem.


Theorem  4.126:  Orthogonal  Basis  of  a  Subspace 


Let $\{w_1, w_2, \ldots, w_k\}$ be an orthonormal set of vectors in $\mathbb{R}^n$. Then this set is linearly independent
and forms a basis for the subspace $W = \operatorname{span}\{w_1, w_2, \ldots, w_k\}$.


Proof. To show it is a linearly independent set, suppose a linear combination of these vectors equals 0,
such as:

\[
a_1 w_1 + a_2 w_2 + \cdots + a_k w_k = 0, \quad a_i \in \mathbb{R}
\]

We need to show that all $a_i = 0$. To do so, take the dot product of each side of the above equation with the
vector $w_i$ and obtain the following.

\[
\begin{aligned}
w_i \cdot (a_1 w_1 + a_2 w_2 + \cdots + a_k w_k) &= w_i \cdot 0 \\
a_1 (w_i \cdot w_1) + a_2 (w_i \cdot w_2) + \cdots + a_k (w_i \cdot w_k) &= 0
\end{aligned}
\]

Now since the set is orthogonal, $w_i \cdot w_m = 0$ for all $m \neq i$, so we have:

\[
\begin{aligned}
a_1 (0) + \cdots + a_i (w_i \cdot w_i) + \cdots + a_k (0) &= 0 \\
a_i \|w_i\|^2 &= 0
\end{aligned}
\]

Since the set is orthogonal, we know that $\|w_i\|^2 \neq 0$. It follows that $a_i = 0$. Since the $a_i$ was chosen
arbitrarily, the set $\{w_1, w_2, \ldots, w_k\}$ is linearly independent.

Finally since $W = \operatorname{span}\{w_1, w_2, \ldots, w_k\}$, the set of vectors also spans $W$ and therefore forms a basis of
$W$.

If an orthogonal set is a basis for a subspace, we call this an orthogonal basis. Similarly, if an
orthonormal set is a basis, we call this an orthonormal basis.

We conclude this section with a discussion of Fourier expansions. Given any orthogonal basis $B$ of $\mathbb{R}^n$
and an arbitrary vector $x \in \mathbb{R}^n$, how do we express $x$ as a linear combination of vectors in $B$? The solution
is Fourier expansion.


Consider  the  following  example. 


Example  4.128:  Fourier  Expansion 


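Let $u_1 = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}$, $u_2 = \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix}$, and
$u_3 = \begin{bmatrix} 5 \\ 1 \\ -2 \end{bmatrix}$, and let $x = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$.

Then $B = \{u_1, u_2, u_3\}$ is an orthogonal basis of $\mathbb{R}^3$.

Compute the Fourier expansion of $x$, thus writing $x$ as a linear combination of the vectors of $B$.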


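Solution. Since $B$ is a basis (verify!) there is a unique way to express $x$ as a linear combination of the
vectors of $B$. Moreover since $B$ is an orthogonal basis (verify!), this can be done by computing the
Fourier expansion of $x$.

That is:

\[
x = \left( \frac{x \cdot u_1}{\|u_1\|^2} \right) u_1 + \left( \frac{x \cdot u_2}{\|u_2\|^2} \right) u_2
+ \left( \frac{x \cdot u_3}{\|u_3\|^2} \right) u_3.
\]

We readily compute:

\[
\frac{x \cdot u_1}{\|u_1\|^2} = \frac{2}{6}, \quad \frac{x \cdot u_2}{\|u_2\|^2} = \frac{3}{5}, \quad
\text{and} \quad \frac{x \cdot u_3}{\|u_3\|^2} = \frac{4}{30}.
\]

Therefore,

\[
\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
= \frac{1}{3} \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}
+ \frac{3}{5} \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix}
+ \frac{2}{15} \begin{bmatrix} 5 \\ 1 \\ -2 \end{bmatrix}.
\]

♠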
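The Fourier expansion above can be checked numerically. The following is a minimal sketch in Python using exact fractions; the helper names `dot` and `fourier_coefficients` are illustrative choices, not notation from the text, and the sketch assumes the basis passed in is already orthogonal.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors given as sequences of numbers."""
    return sum(a * b for a, b in zip(u, v))

def fourier_coefficients(x, basis):
    """Coefficients (x . u_j) / ||u_j||^2, assuming `basis` is orthogonal."""
    return [Fraction(dot(x, u), dot(u, u)) for u in basis]

# Vectors of Example 4.128.
u1, u2, u3 = [1, -1, 2], [0, 2, 1], [5, 1, -2]
x = [1, 1, 1]

coeffs = fourier_coefficients(x, [u1, u2, u3])

# Rebuild x from its expansion as a consistency check.
rebuilt = [sum(c * u[i] for c, u in zip(coeffs, [u1, u2, u3])) for i in range(3)]
```

Here `coeffs` holds the reduced fractions 1/3, 3/5 and 2/15, and `rebuilt` recovers the original vector. If the basis were not orthogonal, these coefficients would in general not reproduce $x$.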


4.11.2.  Orthogonal  Matrices 


Recall that the process to find the inverse of a matrix was often cumbersome. In contrast, it was very easy
to take the transpose of a matrix. Luckily for some special matrices, the transpose equals the inverse. When
an $n \times n$ matrix has all real entries and its transpose equals its inverse, the matrix is called an orthogonal
matrix.

The  precise  definition  is  as  follows. 


Definition  4.129:  Orthogonal  Matrices 


A real $n \times n$ matrix $U$ is called an orthogonal matrix if $UU^T = U^TU = I$.


Note that since $U$ is assumed to be a square matrix, it suffices to verify that only one of these equalities,
$UU^T = I$ or $U^TU = I$, holds to guarantee that $U^T$ is the inverse of $U$.

Consider  the  following  example. 


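Solution. All we need to do is verify (one of the equations from) the requirements of Definition 4.129.

\[
UU^T =
\begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix}
\begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

Since $UU^T = I$, this matrix is orthogonal. ♠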

Here  is  another  example. 


Example  4.131:  Orthogonal  Matrix 

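Let $U = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{bmatrix}$. Is $U$ orthogonal?

Solution. Again the answer is yes and this can be verified simply by showing that $U^TU = I$:

\[
U^TU =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{bmatrix}^T
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]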

♠

When we say that $U$ is orthogonal, we are saying that $UU^T = I$, meaning that

\[
\sum_j u_{ij} u^T_{jk} = \sum_j u_{ij} u_{kj} = \delta_{ik}
\]

where $\delta_{ij}$ is the Kronecker symbol defined by

\[
\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}
\]

In words, the product of the $i^{th}$ row of $U$ with the $k^{th}$ row gives 1 if $i = k$ and 0 if $i \neq k$. The same is
true of the columns because $U^TU = I$ also. Therefore,

\[
\sum_j u^T_{ij} u_{jk} = \sum_j u_{ji} u_{jk} = \delta_{ik}
\]

which says that the product of one column with another column gives 1 if the two columns are the same
and 0 if the two columns are different.

More succinctly, this states that if $u_1, \ldots, u_n$ are the columns of $U$, an orthogonal matrix, then

\[
u_i \cdot u_j = \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}
\]

We will say that the columns form an orthonormal set of vectors, and similarly for the rows. Thus
a matrix is orthogonal if its rows (or columns) form an orthonormal set of vectors. Notice that the
convention is to call such a matrix orthogonal rather than orthonormal (although this may make more
sense!).


Proposition  4.132:  Orthonormal  Basis 


The rows of an $n \times n$ orthogonal matrix form an orthonormal basis of $\mathbb{R}^n$. Further, any orthonormal
basis of $\mathbb{R}^n$ can be used to construct an $n \times n$ orthogonal matrix.


Proof. Recall from Theorem 4.126 that an orthonormal set is linearly independent and forms a basis for
its span. Since the rows of an $n \times n$ orthogonal matrix form an orthonormal set, they must be linearly
independent. Now we have $n$ linearly independent vectors, and it follows that their span equals $\mathbb{R}^n$. Therefore
these vectors form an orthonormal basis for $\mathbb{R}^n$.

Suppose now that we have an orthonormal basis for $\mathbb{R}^n$. Since the basis will contain $n$ vectors, these
can be used to construct an $n \times n$ matrix, with each vector becoming a row. Therefore the matrix is
composed of orthonormal rows, which by our above discussion, means that the matrix is orthogonal. Note
we could also have constructed a matrix with each vector becoming a column instead, and this would again
be an orthogonal matrix. In fact this is simply the transpose of the previous matrix. ♠

Consider  the  following  proposition. 


Proposition  4.133:  Determinant  of  Orthogonal  Matrices 


Suppose  U  is  an  orthogonal  matrix.  Then  det  (U)  =  ±1. 


Proof. This result follows from the properties of determinants. Recall that for any matrix $A$, $\det(A^T) =
\det(A)$. Now if $U$ is orthogonal, then:

\[
(\det(U))^2 = \det(U^T) \det(U) = \det(U^TU) = \det(I) = 1
\]

Therefore $(\det(U))^2 = 1$ and it follows that $\det(U) = \pm 1$. ♠

Orthogonal matrices are divided into two classes, proper and improper. The proper orthogonal
matrices are those whose determinant equals 1 and the improper ones are those whose determinant equals
$-1$. The reason for the distinction is that the improper orthogonal matrices are sometimes considered to
have no physical significance. These matrices cause a change in orientation which would correspond to
material passing through itself in a nonphysical manner. Thus in considering which coordinate systems
must be considered in certain applications, you only need to consider those which are related by a proper
orthogonal transformation. Geometrically, the linear transformations determined by the proper orthogonal
matrices correspond to the composition of rotations.

We  conclude  this  section  with  two  useful  properties  of  orthogonal  matrices. 


Example  4.134:  Product  and  Inverse  of  Orthogonal  Matrices 


Suppose $A$ and $B$ are orthogonal matrices. Then $AB$ and $A^{-1}$ both exist and are orthogonal.

Solution. First we examine the product $AB$.

\[
(AB)(B^TA^T) = A(BB^T)A^T = AA^T = I
\]

Since $AB$ is square, $B^TA^T = (AB)^T$ is the inverse of $AB$, so $AB$ is invertible, and $(AB)^{-1} = (AB)^T$.
Therefore, $AB$ is orthogonal.

Next we show that $A^{-1} = A^T$ is also orthogonal.

\[
(A^{-1})^{-1} = A = (A^T)^T = (A^{-1})^T
\]

Therefore $A^{-1}$ is also orthogonal. ♠
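These properties of orthogonal matrices can be spot-checked on small integer matrices. The sketch below is plain Python with illustrative helper names (`transpose`, `matmul`, `is_orthogonal`, `det3`); the matrix `U` is the one from Example 4.131, while `P` is a hypothetical second orthogonal matrix (a permutation matrix) chosen only for the demonstration.

```python
def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    """Matrix product of A and B (lists of rows)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def is_orthogonal(U):
    """Check U^T U = I exactly; suitable for integer or Fraction entries."""
    n = len(U)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    return matmul(transpose(U), U) == identity

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

U = [[1, 0, 0], [0, 0, -1], [0, -1, 0]]   # matrix of Example 4.131
P = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]     # hypothetical permutation matrix

product = matmul(U, P)       # product of two orthogonal matrices
inverse_of_U = transpose(U)  # for an orthogonal matrix, the inverse is the transpose
```

With these inputs, `det3(U)` is $-1$ (an improper orthogonal matrix) and `det3(P)` is $1$ (a proper one), while `product` and `inverse_of_U` both pass `is_orthogonal`. For matrices with irrational entries such as $1/\sqrt{2}$, an exact comparison is not appropriate and a tolerance-based check would be needed instead.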

4.11.3.  Gram-Schmidt  Process 


The  Gram-Schmidt  process  is  an  algorithm  to  transform  a  set  of  vectors  into  an  orthonormal  set  spanning 
the  same  subspace,  that  is  generating  the  same  collection  of  linear  combinations  (see  Definition  9.12). 

The  goal  of  the  Gram-Schmidt  process  is  to  take  a  linearly  independent  set  of  vectors  and  transform  it 
into  an  orthonormal  set  with  the  same  span.  The  first  objective  is  to  construct  an  orthogonal  set  of  vectors 
with  the  same  span,  since  from  there  an  orthonormal  set  can  be  obtained  by  simply  dividing  each  vector 
by  its  length. 


Proof. The full proof of this algorithm is beyond this material, however here is an indication of the
arguments.

To show that $\{v_1, \ldots, v_n\}$ is an orthogonal set, let

\[
a_2 = \frac{u_2 \cdot v_1}{\|v_1\|^2}
\]

then:

\[
\begin{aligned}
v_1 \cdot v_2 &= v_1 \cdot (u_2 - a_2 v_1) \\
&= v_1 \cdot u_2 - a_2 (v_1 \cdot v_1) \\
&= v_1 \cdot u_2 - \frac{u_2 \cdot v_1}{\|v_1\|^2} \|v_1\|^2 \\
&= (v_1 \cdot u_2) - (u_2 \cdot v_1) = 0
\end{aligned}
\]

Now that you have shown that $\{v_1, v_2\}$ is orthogonal, use the same method as above to show that $\{v_1, v_2, v_3\}$
is also orthogonal, and so on.

Then in a similar fashion you show that $\operatorname{span}\{u_1, \ldots, u_n\} = \operatorname{span}\{v_1, \ldots, v_n\}$.

Finally defining $w_i = \frac{v_i}{\|v_i\|}$ for $i = 1, \ldots, n$ does not affect orthogonality and yields vectors of length 1,
hence an orthonormal set. You can also observe that it does not affect the span either and the proof would
be complete. ♠

Consider  the  following  example. 


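Solution. We already remarked that the set of vectors in $\{u_1, u_2\}$ is linearly independent, so we can proceed
with the Gram-Schmidt algorithm:

\[
v_1 = u_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}
\]

\[
v_2 = u_2 - \left( \frac{u_2 \cdot v_1}{\|v_1\|^2} \right) v_1
= \begin{bmatrix} 3 \\ 2 \\ 0 \end{bmatrix} - \frac{5}{2} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \end{bmatrix}
\]

Now to normalize simply let

\[
w_1 = \frac{v_1}{\|v_1\|} = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0 \end{bmatrix}, \qquad
w_2 = \frac{v_2}{\|v_2\|} = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix}
\]

You can verify that $\{w_1, w_2\}$ is an orthonormal set of vectors having the same span as $\{u_1, u_2\}$, namely
the $XY$-plane. ♠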
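The computation in this example can be reproduced with a short routine. The following Python sketch implements the classical Gram-Schmidt step $v_k = u_k - \sum_j \frac{u_k \cdot v_j}{\|v_j\|^2} v_j$ with exact fractions and then normalizes in floating point. The function names are illustrative, and the routine assumes its input is linearly independent (otherwise a zero vector appears and the division fails).

```python
from fractions import Fraction
import math

def dot(u, v):
    """Dot product of two vectors given as sequences of numbers."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthogonal list with the same span as the linearly independent input."""
    orthogonal = []
    for u in vectors:
        v = [Fraction(a) for a in u]
        for w in orthogonal:
            coeff = dot(u, w) / dot(w, w)   # (u . w) / ||w||^2
            v = [a - coeff * b for a, b in zip(v, w)]
        orthogonal.append(v)
    return orthogonal

def normalize(v):
    """Divide a vector by its length (floating point)."""
    length = math.sqrt(dot(v, v))
    return [float(a) / length for a in v]

# u1 = (1, 1, 0) and u2 = (3, 2, 0), as in the solution above.
v1, v2 = gram_schmidt([[1, 1, 0], [3, 2, 0]])
w1, w2 = normalize(v1), normalize(v2)
```

Here `v2` comes out as $(1/2, -1/2, 0)$, and `w1`, `w2` have length 1 up to rounding. Keeping the orthogonalization exact and deferring the square roots to the final normalization avoids accumulating rounding error in the projection coefficients.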

In  this  example,  we  began  with  a  linearly  independent  set  and  found  an  orthonormal  set  of  vectors 
which  had  the  same  span.  It  turns  out  that  if  we  start  with  a  basis  of  a  subspace  and  apply  the  Gram- 
Schmidt  algorithm,  the  result  will  be  an  orthogonal  basis  of  the  same  subspace.  We  examine  this  in  the 
following  example. 


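Solution. First $f_1 = x_1$.
Next,

\[
f_2 = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 1 \end{bmatrix} - \frac{2}{2} \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
\]

Finally,

\[
f_3 = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} - \frac{1}{2} \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}
- \frac{0}{1} \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} 1/2 \\ 1 \\ -1/2 \\ 0 \end{bmatrix}
\]

Therefore,

\[
\left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix},
\begin{bmatrix} 1/2 \\ 1 \\ -1/2 \\ 0 \end{bmatrix} \right\}
\]

is an orthogonal basis of $U$. However, it is sometimes more convenient to deal with vectors having integer
entries, in which case we take (scaling the last vector by 2)

\[
\left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix},
\begin{bmatrix} 1 \\ 2 \\ -1 \\ 0 \end{bmatrix} \right\}
\]

♠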


4.11.4.  Orthogonal  Projections 


An  important  use  of  the  Gram-Schmidt  Process  is  in  orthogonal  projections,  the  focus  of  this  section. 

You may recall that a subspace of $\mathbb{R}^n$ is a set of vectors which contains the zero vector, and is closed
under addition and scalar multiplication. Let's call such a subspace $W$. In particular, a plane in $\mathbb{R}^n$ which
contains the origin, $(0, 0, \ldots, 0)$, is a subspace of $\mathbb{R}^n$.

Suppose a point $Y$ in $\mathbb{R}^n$ is not contained in $W$, then what point $Z$ in $W$ is closest to $Y$? Using the
Gram-Schmidt Process, we can find such a point. Let $y, z$ represent the position vectors of the points $Y$ and
$Z$ respectively, with $y - z$ representing the vector connecting the two points $Y$ and $Z$. It will follow that if
$Z$ is the point on $W$ closest to $Y$, then $y - z$ will be perpendicular to $W$ (can you see why?); in other words,
$y - z$ is orthogonal to $W$ (and to every vector contained in $W$) as in the following diagram.


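[Diagram: a point $Y$ outside the subspace $W$ and the closest point $Z$ in $W$, with $y - z$ perpendicular to $W$.]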


The  vector  z  is  called  the  orthogonal  projection  of  y  on  W.  The  definition  is  given  as  follows. 


Therefore,  in  order  to  find  the  orthogonal  projection,  we  must  first  find  an  orthogonal  basis  for  the 
subspace.  Note  that  one  could  use  an  orthonormal  basis,  but  it  is  not  necessary  in  this  case  since  as  you 
can  see  above  the  normalization  of  each  vector  is  included  in  the  formula  for  the  projection. 

Before  we  explore  this  further  through  an  example,  we  show  that  the  orthogonal  projection  does  indeed 
yield  a  point  Z  (the  point  whose  position  vector  is  the  vector  z  above)  which  is  the  point  of  W  closest  to  Y. 




Theorem  4.139:  Approximation  Theorem 


Let $W$ be a subspace of $\mathbb{R}^n$ and $Y$ any point in $\mathbb{R}^n$. Let $Z$ be the point whose position vector is the
orthogonal projection of $Y$ onto $W$.

Then, $Z$ is the point in $W$ closest to $Y$.


Proof. First $Z$ is certainly a point in $W$ since it is in the span of a basis of $W$.

To show that $Z$ is the point in $W$ closest to $Y$, we wish to show that $\|y - z_1\| > \|y - z\|$ for all $z_1 \neq z$ in $W$.
We begin by writing $y - z_1 = (y - z) + (z - z_1)$. Now, the vector $y - z$ is orthogonal to $W$, and $z - z_1$ is
contained in $W$. Therefore these vectors are orthogonal to each other. By the Pythagorean Theorem, we
have that

\[
\|y - z_1\|^2 = \|y - z\|^2 + \|z - z_1\|^2 > \|y - z\|^2
\]

This follows because $z \neq z_1$ so $\|z - z_1\|^2 > 0$.

Hence, $\|y - z_1\|^2 > \|y - z\|^2$. Taking the square root of each side, we obtain the desired result. ♠
Consider  the  following  example. 


Example  4.140:  Orthogonal  Projection 


Let $W$ be the plane through the origin given by the equation $x - 2y + z = 0$.
Find the point in $W$ closest to the point $Y = (1, 0, 3)$.


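Solution. We must first find an orthogonal basis for $W$. Notice that $W$ is characterized by all points $(a, b, c)$
where $c = 2b - a$. In other words,

\[
W = \left\{ \begin{bmatrix} a \\ b \\ 2b - a \end{bmatrix}
= a \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} + b \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix}, \; a, b \in \mathbb{R} \right\}
\]

We can thus write $W$ as

\[
W = \operatorname{span}\{u_1, u_2\}
= \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} \right\}
\]

Notice that this spanning set is a basis of $W$ as it is linearly independent. We will use the Gram-Schmidt
Process to convert this to an orthogonal basis, $\{w_1, w_2\}$. In this case, as we remarked it is only necessary
to find an orthogonal basis, and it is not required that it be orthonormal.

\[
w_1 = u_1 = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}
\]

\[
\begin{aligned}
w_2 &= u_2 - \left( \frac{u_2 \cdot w_1}{\|w_1\|^2} \right) w_1 \\
&= \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} - \left( \frac{-2}{2} \right) \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} \\
&= \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}
= \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
\end{aligned}
\]

Therefore an orthogonal basis of $W$ is

\[
\{w_1, w_2\} = \left\{ \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right\}
\]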

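We can now use this basis to find the orthogonal projection of the point $Y = (1, 0, 3)$ on the subspace $W$.
We will write the position vector $y$ of $Y$ as $y = \begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$. Using Definition 4.138, we compute the projection
as follows:

\[
\begin{aligned}
z = \operatorname{proj}_W(y) &= \left( \frac{y \cdot w_1}{\|w_1\|^2} \right) w_1 + \left( \frac{y \cdot w_2}{\|w_2\|^2} \right) w_2 \\
&= \left( \frac{-2}{2} \right) \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} + \frac{4}{3} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} \frac{1}{3} \\ \frac{4}{3} \\ \frac{7}{3} \end{bmatrix}
\end{aligned}
\]

Therefore the point $Z$ on $W$ closest to the point $(1, 0, 3)$ is $\left( \frac{1}{3}, \frac{4}{3}, \frac{7}{3} \right)$.

♠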
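The projection in this example can be verified with a few lines of code. The sketch below is a minimal Python illustration; it assumes the orthogonal basis $w_1 = (1, 0, -1)$, $w_2 = (1, 1, 1)$ of the plane $x - 2y + z = 0$ (obtained by applying the Gram-Schmidt Process to $u_1 = (1, 0, -1)$ and $u_2 = (0, 1, 2)$), and the function name `project` is an illustrative choice.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors given as sequences of numbers."""
    return sum(a * b for a, b in zip(u, v))

def project(y, orthogonal_basis):
    """Orthogonal projection of y onto the span of an orthogonal basis.

    Each term is ((y . w) / ||w||^2) w; the formula is only valid when
    the basis vectors are mutually orthogonal.
    """
    z = [Fraction(0)] * len(y)
    for w in orthogonal_basis:
        coeff = Fraction(dot(y, w), dot(w, w))
        z = [zi + coeff * wi for zi, wi in zip(z, w)]
    return z

w1, w2 = [1, 0, -1], [1, 1, 1]   # orthogonal basis of the plane x - 2y + z = 0
y = [1, 0, 3]                    # position vector of Y

z = project(y, [w1, w2])
residual = [yi - zi for yi, zi in zip(y, z)]   # y - z, expected to lie in the orthogonal complement
```

The result `z` is $(1/3, 4/3, 7/3)$, and `residual` has zero dot product with both basis vectors, which is the orthogonality condition that characterizes the closest point.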

Recall that the vector $y - z$ is perpendicular (orthogonal) to all the vectors contained in the plane $W$.
Using a basis for $W$, we can in fact find all such vectors which are perpendicular to $W$. We call this set of
vectors the orthogonal complement of $W$ and denote it $W^\perp$.

The orthogonal complement is defined as the set of all vectors which are orthogonal to all vectors in
the original subspace. It turns out that it is sufficient that the vectors in the orthogonal complement be
orthogonal to a spanning set of the original space.


Proposition  4.142:  Orthogonal  to  Spanning  Set 


Let $W$ be a subspace of $\mathbb{R}^n$ such that $W = \operatorname{span}\{w_1, w_2, \ldots, w_m\}$. Then $W^\perp$ is the set of all vectors
which are orthogonal to each $w_i$ in the spanning set.


The  following  proposition  demonstrates  that  the  orthogonal  complement  of  a  subspace  is  itself  a  sub¬ 
space. 


Proposition  4.143:  The  Orthogonal  Complement 


Let $W$ be a subspace of $\mathbb{R}^n$. Then the orthogonal complement $W^\perp$ is also a subspace of $\mathbb{R}^n$.


Consider  the  following  proposition. 


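Proof. Here, $0$ is the zero vector of $\mathbb{R}^n$. Since $x \cdot 0 = 0$ for all $x \in \mathbb{R}^n$, $\mathbb{R}^n \subseteq \{0\}^\perp$. Since $\{0\}^\perp \subseteq \mathbb{R}^n$, the
equality follows, i.e., $\{0\}^\perp = \mathbb{R}^n$.

Again, since $x \cdot 0 = 0$ for all $x \in \mathbb{R}^n$, $0 \in (\mathbb{R}^n)^\perp$, so $\{0\} \subseteq (\mathbb{R}^n)^\perp$. Suppose $x \in \mathbb{R}^n$, $x \neq 0$. Since
$x \cdot x = \|x\|^2$ and $x \neq 0$, $x \cdot x \neq 0$, so $x \notin (\mathbb{R}^n)^\perp$. Therefore $(\mathbb{R}^n)^\perp \subseteq \{0\}$, and thus $(\mathbb{R}^n)^\perp = \{0\}$. ♠

In the next example, we will look at how to find $W^\perp$.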


Example  4.145:  Orthogonal  Complement 


Let $W$ be the plane through the origin given by the equation $x - 2y + z = 0$. Find a basis for the
orthogonal complement of $W$.


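Solution.

From Example 4.140 we know that we can write $W$ as

\[
W = \operatorname{span}\{u_1, u_2\}
= \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} \right\}
\]

In order to find $W^\perp$, we need to find all $x$ which are orthogonal to every vector in this span.

Let $x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$. In order to satisfy $x \cdot u_1 = 0$, the following equation must hold.

\[
x_1 - x_3 = 0
\]

In order to satisfy $x \cdot u_2 = 0$, the following equation must hold.

\[
x_2 + 2x_3 = 0
\]

Both of these equations must be satisfied, so we have the following system of equations.

\[
\begin{aligned}
x_1 - x_3 &= 0 \\
x_2 + 2x_3 &= 0
\end{aligned}
\]

To solve, set up the augmented matrix.

\[
\left[ \begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 0 \end{array} \right]
\]

Using Gaussian Elimination, we find that $W^\perp = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} \right\}$, and hence
$\left\{ \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} \right\}$ is a basis for $W^\perp$.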
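For a plane in $\mathbb{R}^3$ there is a quick independent check of this answer: the cross product of two spanning vectors is orthogonal to both of them. This is a different technique from the Gaussian elimination used in the solution and only applies in three dimensions. A minimal Python sketch, with illustrative helper names and the spanning vectors $u_1 = (1, 0, -1)$, $u_2 = (0, 1, 2)$ of the plane $x - 2y + z = 0$:

```python
def dot(u, v):
    """Dot product of two vectors given as sequences of numbers."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product in R^3; the result is orthogonal to both u and v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

u1, u2 = [1, 0, -1], [0, 1, 2]   # spanning set of W
n = cross(u1, u2)                # candidate basis vector for the orthogonal complement
```

The vector `n` equals $(1, -2, 1)$, which matches the coefficients of the plane equation $x - 2y + z = 0$ and the basis vector found above.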

The  following  results  summarize  the  important  properties  of  the  orthogonal  projection. 


Theorem  4.146:  Orthogonal  Projection 


Let $W$ be a subspace of $\mathbb{R}^n$, $Y$ be any point in $\mathbb{R}^n$, and let $Z$ be the point in $W$ closest to $Y$. Then,

1. The position vector $z$ of the point $Z$ is given by $z = \operatorname{proj}_W(y)$

2. $z \in W$ and $y - z \in W^\perp$

3. $|Y - Z| < |Y - Z_1|$ for all $Z_1 \neq Z \in W$


Consider  the  following  example  of  this  concept. 


Example  4.147:  Find  a  Vector  Closest  to  a  Given  Vector 


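Let

\[
x_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \quad
x_2 = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 1 \end{bmatrix}, \quad
x_3 = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad \text{and} \quad
v = \begin{bmatrix} 4 \\ 3 \\ -2 \\ 5 \end{bmatrix}
\]

We want to find the vector in $W = \operatorname{span}\{x_1, x_2, x_3\}$ closest to $v$.


Solution. We will first use the Gram-Schmidt Process to construct the orthogonal basis, $B$, of $W$:

\[
B = \left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix},
\begin{bmatrix} 1 \\ 2 \\ -1 \\ 0 \end{bmatrix} \right\}
\]

By Theorem 4.146,

\[
\operatorname{proj}_W(v) = \frac{2}{2} \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}
+ \frac{5}{1} \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
+ \frac{12}{6} \begin{bmatrix} 1 \\ 2 \\ -1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 3 \\ 4 \\ -1 \\ 5 \end{bmatrix}
\]

is the vector in $W$ closest to $v$. ♠

Consider the next example.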


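Solution. From Theorem 4.139, the point $Z$ in $W$ closest to $Y$ is given by $z = \operatorname{proj}_W(y)$.
Notice that since the above vectors already give an orthogonal basis for $W$, we have:

\[
z = \operatorname{proj}_W(y) = \left( \frac{y \cdot w_1}{\|w_1\|^2} \right) w_1 + \left( \frac{y \cdot w_2}{\|w_2\|^2} \right) w_2
= \begin{bmatrix} 2 \\ 2 \\ 2 \\ 4 \end{bmatrix}
\]

Therefore the point in $W$ closest to $Y$ is $Z = (2, 2, 2, 4)$.

Now, we need to write $y$ as the sum of a vector in $W$ and a vector in $W^\perp$. This can easily be done as
follows:

\[
y = z + (y - z)
\]

since $z$ is in $W$ and as we have seen $y - z$ is in $W^\perp$.
The vector $y - z$ is given by

\[
y - z = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} - \begin{bmatrix} 2 \\ 2 \\ 2 \\ 4 \end{bmatrix}
= \begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \end{bmatrix}
\]

Therefore, we can write $y$ as

\[
\begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 2 \\ 4 \end{bmatrix}
+ \begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \end{bmatrix}
\]

♠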


Example  4.149:  Point  in  a  Plane  Closest  to  a  Given  Point 


Find the point $Z$ in the plane $3x + y - 2z = 0$ that is closest to the point $Y = (1, 1, 1)$.


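Solution. The solution will proceed as follows.

1. Find a basis $X$ of the subspace $W$ of $\mathbb{R}^3$ defined by the equation $3x + y - 2z = 0$.

2. Orthogonalize the basis $X$ to get an orthogonal basis $B$ of $W$.

3. Find the projection on $W$ of the position vector of the point $Y$.

We now begin the solution.


1. $3x + y - 2z = 0$ is a system of one equation in three variables. Putting the augmented matrix in
reduced row-echelon form:

\[
\left[ \begin{array}{ccc|c} 3 & 1 & -2 & 0 \end{array} \right] \rightarrow
\left[ \begin{array}{ccc|c} 1 & \frac{1}{3} & -\frac{2}{3} & 0 \end{array} \right]
\]

gives general solution $x = -\frac{1}{3}s + \frac{2}{3}t$, $y = s$, $z = t$ for any $s, t \in \mathbb{R}$. Then

\[
W = \operatorname{span}\left\{ \begin{bmatrix} -\frac{1}{3} \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} \frac{2}{3} \\ 0 \\ 1 \end{bmatrix} \right\}
\]

Let $X = \left\{ \begin{bmatrix} -1 \\ 3 \\ 0 \end{bmatrix}, \begin{bmatrix} 2 \\ 0 \\ 3 \end{bmatrix} \right\}$. Then $X$ is linearly independent and $\operatorname{span}(X) = W$, so $X$ is a basis of
$W$.

2. Use the Gram-Schmidt Process to get an orthogonal basis of $W$:

\[
f_1 = \begin{bmatrix} -1 \\ 3 \\ 0 \end{bmatrix} \quad \text{and} \quad
f_2 = \begin{bmatrix} 2 \\ 0 \\ 3 \end{bmatrix} - \frac{-2}{10} \begin{bmatrix} -1 \\ 3 \\ 0 \end{bmatrix}
= \frac{1}{5} \begin{bmatrix} 9 \\ 3 \\ 15 \end{bmatrix}
\]

Therefore $B = \left\{ \begin{bmatrix} -1 \\ 3 \\ 0 \end{bmatrix}, \begin{bmatrix} 3 \\ 1 \\ 5 \end{bmatrix} \right\}$ is an orthogonal basis of $W$.

3. To find the point $Z$ on $W$ closest to $Y = (1, 1, 1)$, compute

\[
\operatorname{proj}_W \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
= \frac{2}{10} \begin{bmatrix} -1 \\ 3 \\ 0 \end{bmatrix} + \frac{9}{35} \begin{bmatrix} 3 \\ 1 \\ 5 \end{bmatrix}
= \frac{1}{7} \begin{bmatrix} 4 \\ 6 \\ 9 \end{bmatrix}
\]

Therefore, $Z = \left( \frac{4}{7}, \frac{6}{7}, \frac{9}{7} \right)$.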
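Because $W$ here is a plane through the origin in $\mathbb{R}^3$, the answer can be cross-checked by a shortcut that is not the method of the solution above: subtract from $y$ its component along the normal vector of the plane. The sketch below assumes the normal $n = (3, 1, -2)$ read off from the equation $3x + y - 2z = 0$; the function name is illustrative.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors given as sequences of numbers."""
    return sum(a * b for a, b in zip(u, v))

def closest_point_on_plane(normal, y):
    """Closest point to y on the plane normal . x = 0 (a plane through the origin).

    Computes y - ((y . n) / ||n||^2) n, i.e. removes the component of y
    along the normal direction.
    """
    coeff = Fraction(dot(y, normal), dot(normal, normal))
    return [yi - coeff * ni for yi, ni in zip(y, normal)]

Z = closest_point_on_plane([3, 1, -2], [1, 1, 1])
```

This returns $(4/7, 6/7, 9/7)$, in agreement with the projection computed from the orthogonal basis. The same function applied to the plane $x - 2y + z = 0$ and the point $(1, 0, 3)$ gives $(1/3, 4/3, 7/3)$, matching Example 4.140.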


4.11.5.  Least  Squares  Approximation 


It should not be surprising to hear that many problems do not have a perfect solution, and in these cases the
objective is always to try to do the best possible. For example what does one do if there are no solutions to
a system of linear equations $Ax = b$? It turns out that what we do is find $x$ such that $Ax$ is as close to $b$ as
possible. A very important technique that follows from orthogonal projections is that of the least squares
approximation, which allows us to do exactly that.

We  begin  with  a  lemma. 

Recall that we can form the image of an $m \times n$ matrix $A$ by $\operatorname{im}(A) = \{Ax : x \in \mathbb{R}^n\}$. Rephrasing
Theorem 4.146 using the subspace $W = \operatorname{im}(A)$ gives the equivalence of an orthogonality condition with
a minimization condition. The following picture illustrates this orthogonality condition and geometric
meaning of this theorem.

We note a simple but useful observation.


Proof. This follows from the definitions:

\[
Ax \cdot y = \sum_{i,j} a_{ij} x_j y_i = \sum_{i,j} x_j (A^T)_{ji} y_i = x \cdot A^T y
\]

♠

The  next  corollary  gives  the  technique  of  least  squares. 


Corollary  4.152:  Least  Squares  and  Normal  Equation 


A specific value of $x$ which solves the problem of Theorem 4.150 is obtained by solving the equation

\[
A^TAx = A^Ty
\]

Furthermore, there always exists a solution to this system of equations.


Proof. For $x$ the minimizer of Theorem 4.150, $(y - Ax) \cdot Au = 0$ for all $u \in \mathbb{R}^n$ and from Lemma 4.151,
this is the same as saying

\[
A^T(y - Ax) \cdot u = 0
\]

for all $u \in \mathbb{R}^n$. This implies

\[
A^Ty - A^TAx = 0.
\]

Therefore, there is a solution to the equation of this corollary, and it solves the minimization problem of
Theorem 4.150. ♠


Note that $x$ might not be unique but $Ax$, the closest point of $A(\mathbb{R}^n)$ to $y$, is unique as was shown in the
above argument.


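Consider the following example.

Solution. First, consider whether there exists a real solution. To do so, set up the augmented matrix given
by

\[
\left[ \begin{array}{cc|c} 2 & 1 & 2 \\ -1 & 3 & 1 \\ 4 & 5 & 1 \end{array} \right]
\]

The reduced row-echelon form of this augmented matrix is

\[
\left[ \begin{array}{cc|c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right]
\]

It follows that there is no real solution to this system. Therefore we wish to find the least squares
solution. The normal equations are

\[
\begin{aligned}
A^TAx &= A^Ty \\
\begin{bmatrix} 2 & -1 & 4 \\ 1 & 3 & 5 \end{bmatrix}
\begin{bmatrix} 2 & 1 \\ -1 & 3 \\ 4 & 5 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
&= \begin{bmatrix} 2 & -1 & 4 \\ 1 & 3 & 5 \end{bmatrix}
\begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix}
\end{aligned}
\]

and so we need to solve the system

\[
\begin{bmatrix} 21 & 19 \\ 19 & 35 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
= \begin{bmatrix} 7 \\ 10 \end{bmatrix}
\]

This is a familiar exercise and the solution is

\[
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \frac{5}{34} \\ \frac{7}{34} \end{bmatrix}
\]

♠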
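The normal equations of this example are small enough to solve exactly by machine. The following Python sketch forms $A^TA$ and $A^Ty$ for the coefficient matrix with rows $(2, 1)$, $(-1, 3)$, $(4, 5)$ and right-hand side $(2, 1, 1)$, then solves the resulting $2 \times 2$ system by Cramer's rule. The helper names are illustrative, and `solve_2x2` assumes the determinant is nonzero.

```python
from fractions import Fraction

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    """Matrix product of A and B (lists of rows)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve_2x2(M, rhs):
    """Solve a 2 x 2 linear system by Cramer's rule; assumes det(M) != 0."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [Fraction(rhs[0] * d - b * rhs[1], det),
            Fraction(a * rhs[1] - rhs[0] * c, det)]

A = [[2, 1], [-1, 3], [4, 5]]
y = [[2], [1], [1]]              # right-hand side as a column matrix

At = transpose(A)
AtA = matmul(At, A)                        # left side of the normal equations
Aty = [row[0] for row in matmul(At, y)]    # right side, flattened to a list
solution = solve_2x2(AtA, Aty)
```

Here `AtA` is the matrix with rows $(21, 19)$ and $(19, 35)$, `Aty` is $(7, 10)$, and `solution` is $(5/34, 7/34)$. For larger or ill-conditioned systems, forming $A^TA$ explicitly is usually avoided in numerical practice, but for hand-sized examples it mirrors the text directly.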


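Consider another example.

Solution. First, consider whether there exists a real solution. To do so, set up the augmented matrix given
by

\[
\left[ \begin{array}{cc|c} 2 & 1 & 3 \\ -1 & 3 & 2 \\ 4 & 5 & 9 \end{array} \right]
\]

The reduced row-echelon form of this augmented matrix is

\[
\left[ \begin{array}{cc|c} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{array} \right]
\]

It follows that the system has a solution given by $x = y = 1$. However we can also use the normal
equations and find the least squares solution.

\[
\begin{bmatrix} 2 & -1 & 4 \\ 1 & 3 & 5 \end{bmatrix}
\begin{bmatrix} 2 & 1 \\ -1 & 3 \\ 4 & 5 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
= \begin{bmatrix} 2 & -1 & 4 \\ 1 & 3 & 5 \end{bmatrix}
\begin{bmatrix} 3 \\ 2 \\ 9 \end{bmatrix}
\]

Then

\[
\begin{bmatrix} 21 & 19 \\ 19 & 35 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
= \begin{bmatrix} 40 \\ 54 \end{bmatrix}
\]

The least squares solution is

\[
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]

which is the same as the solution found above.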

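An important application of Corollary 4.152 is the problem of finding the least squares regression line
in statistics. Suppose you are given points in the $xy$-plane

\[
\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}
\]

and you would like to find constants $m$ and $b$ such that the line $y = mx + b$ goes through all these points.
Of course this will be impossible in general. Therefore, we try to find $m, b$ such that the line will be as
close as possible. The desired system is

\[
\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}
= \begin{bmatrix} x_1 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix} \begin{bmatrix} m \\ b \end{bmatrix}
\]

which is of the form $y = Ax$. It is desired to choose $m$ and $b$ to make

\[
\left\| A \begin{bmatrix} m \\ b \end{bmatrix} - \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \right\|^2
\]

as small as possible. According to Theorem 4.150 and Corollary 4.152, the best values for $m$ and $b$ occur
as the solution to

\[
A^TA \begin{bmatrix} m \\ b \end{bmatrix} = A^T \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}, \quad \text{where }
A = \begin{bmatrix} x_1 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix}
\]

Thus, computing $A^TA$,

\[
\begin{bmatrix} \sum_{i=1}^n x_i^2 & \sum_{i=1}^n x_i \\ \sum_{i=1}^n x_i & n \end{bmatrix}
\begin{bmatrix} m \\ b \end{bmatrix}
= \begin{bmatrix} \sum_{i=1}^n x_i y_i \\ \sum_{i=1}^n y_i \end{bmatrix}
\]

Solving this system of equations for $m$ and $b$ (using Cramer's rule for example) yields:

\[
m = \frac{-\left( \sum_{i=1}^n x_i \right) \left( \sum_{i=1}^n y_i \right) + \left( \sum_{i=1}^n x_i y_i \right) n}
{\left( \sum_{i=1}^n x_i^2 \right) n - \left( \sum_{i=1}^n x_i \right)^2}
\]

and

\[
b = \frac{-\left( \sum_{i=1}^n x_i \right) \left( \sum_{i=1}^n x_i y_i \right) + \left( \sum_{i=1}^n y_i \right) \sum_{i=1}^n x_i^2}
{\left( \sum_{i=1}^n x_i^2 \right) n - \left( \sum_{i=1}^n x_i \right)^2}
\]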
Consider  the  following  example. 


Solution. In this case we have $n = 5$ data points and we obtain:

\[
\sum_{i=1}^5 x_i = 10, \quad \sum_{i=1}^5 y_i = 14, \quad \sum_{i=1}^5 x_i y_i = 38, \quad \sum_{i=1}^5 x_i^2 = 30
\]

and hence

\[
m = \frac{-10 \cdot 14 + 5 \cdot 38}{5 \cdot 30 - 10^2} = 1.00, \qquad
b = \frac{-10 \cdot 38 + 14 \cdot 30}{5 \cdot 30 - 10^2} = 0.80
\]

The least squares regression line for the set of data points is:

\[
y = x + 0.8
\]

One could use this line to approximate other values for the data. For example for $x = 6$ one could use
$y(6) = 6 + 0.8 = 6.8$ as an approximate value for the data.

The  following  diagram  shows  the  data  points  and  the  corresponding  regression  line. 


[Diagram: the data points and the regression line $y = x + 0.8$.]

♠
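The closed-form expressions for $m$ and $b$ translate directly into code. The sketch below is a minimal Python version using exact fractions. The data set in it is hypothetical: it is chosen only so that $n = 5$, $\sum x_i = 10$, $\sum y_i = 14$, $\sum x_i y_i = 38$ and $\sum x_i^2 = 30$, the sums used in the solution above, and should not be read as the data of the example. The function assumes the $x$-values are not all equal, so that the denominator is nonzero.

```python
from fractions import Fraction

def regression_line(points):
    """Slope m and intercept b of the least squares line y = m x + b.

    Uses the Cramer's-rule solution of the normal equations; assumes the
    x-values are not all equal (nonzero denominator).
    """
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    denom = n * sxx - sx ** 2
    m = Fraction(n * sxy - sx * sy, denom)
    b = Fraction(sy * sxx - sx * sxy, denom)
    return m, b

# Hypothetical data reproducing the sums 10, 14, 38 and 30 quoted above.
points = [(0, 1), (1, 2), (2, 2), (3, 4), (4, 5)]
m, b = regression_line(points)
prediction_at_6 = m * 6 + b
```

For this data the function returns $m = 1$ and $b = 4/5$, so the fitted line is $y = x + 0.8$ and the predicted value at $x = 6$ is $6.8$.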

One  could  clearly  do  a  least  squares  fit  for  curves  of  the  form  y  —  ax 2  -\-bx-\-c  in  the  same  way.  In  this 
case  you  want  to  solve  as  well  as  possible  for  a,  b,  and  c  the  system 


r  2 

x\  X\ 

1 ' 

a 

b 

_ 

>’i 

_xl  x'l 

1 

c 

.  y*  . 

and  one  would  use  the  same  technique  as  above.  Many  other  similar  problems  are  important,  including 
many  in  higher  dimensions  and  they  are  all  solved  the  same  way. 
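
The same technique can be sketched in code for the quadratic case: form $A^T A$ and $A^T \vec{y}$, then solve the resulting $3 \times 3$ system. This is a minimal sketch under stated assumptions, not a definitive implementation. The helper names `solve` and `fit_parabola` are hypothetical, the sample points are made up (they lie exactly on $y = x^2 + 1$), and the solver assumes $A^T A$ is invertible.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve the square system M z = rhs by Gauss-Jordan elimination.

    Assumes M is invertible; a singular M makes the pivot search fail.
    """
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot entry becomes 1.
        aug[col] = [v / aug[col][col] for v in aug[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def fit_parabola(points):
    """Return [a, b, c] minimizing the squared error of y = a*x^2 + b*x + c."""
    A = [[x * x, x, 1] for x, _ in points]
    y = [yv for _, yv in points]
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(3)] for i in range(3)]
    Aty = [sum(row[i] * yv for row, yv in zip(A, y)) for i in range(3)]
    return solve(AtA, Aty)

# Made-up sample points lying exactly on y = x^2 + 1.
points = [(-1, 2), (0, 1), (1, 2), (2, 5)]
a, b, c = fit_parabola(points)
print(a, b, c)
```

Because the sample points lie exactly on a parabola, the least squares solution recovers that parabola's coefficients; with noisy data the same code returns the best-fitting coefficients instead.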


Exercises 


Exercise 4.11.1 Determine whether the following set of vectors is orthogonal. If it is orthogonal,
determine whether it is also orthonormal.


'  IV2V3  ' 

r  jV2i 

1 

1 

^>1  — 

\V2y/3 

9 

0 

5 

.^y/2y/3_ 

UvsJ 

1^3  J 

If  the  set  of  vectors  is  orthogonal  but  not  orthonormal,  give  an  orthonormal  set  of  vectors  which  has  the 
same  span. 

Exercise 4.11.2 Determine whether the following set of vectors is orthogonal. If it is orthogonal,
determine whether it is also orthonormal.

\[ \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} \]

If the set of vectors is orthogonal but not orthonormal, give an orthonormal set of vectors which has the
same span.

Exercise 4.11.3 Determine whether the following set of vectors is orthogonal. If it is orthogonal,
determine whether it is also orthonormal.

\[ \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \]

If the set of vectors is orthogonal but not orthonormal, give an orthonormal set of vectors which has the
same span.

Exercise 4.11.4 Determine whether the following set of vectors is orthogonal. If it is orthogonal,
determine whether it is also orthonormal.

\[ \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \]

If the set of vectors is orthogonal but not orthonormal, give an orthonormal set of vectors which has the
same span.

Exercise 4.11.5 Determine whether the following set of vectors is orthogonal. If it is orthogonal,
determine whether it is also orthonormal.

\[ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} \]

If the set of vectors is orthogonal but not orthonormal, give an orthonormal set of vectors which has the
same span.

Exercise  4.11.6  Here  are  some  matrices.  Label  according  to  whether  they  are  symmetric,  skew  symmetric, 
or  orthogonal. 


n 

0 

0 

(a) 

0  73 

l 

73 

0  73 

l 

73 

\[ \text{(b)} \quad \begin{bmatrix} 1 & 2 & -3 \\ 2 & 1 & 4 \\ -3 & 4 & 7 \end{bmatrix} \]

\[ \text{(c)} \quad \begin{bmatrix} 0 & -2 & -3 \\ 2 & 0 & -4 \\ 3 & 4 & 0 \end{bmatrix} \]

Exercise 4.11.7 For $U$ an orthogonal matrix, explain why $\|U\vec{x}\| = \|\vec{x}\|$ for any vector $\vec{x}$. Next explain why
if $U$ is an $n \times n$ matrix with the property that $\|U\vec{x}\| = \|\vec{x}\|$ for all vectors $\vec{x}$, then $U$ must be orthogonal.
Thus the orthogonal matrices are exactly those which preserve length.

Exercise 4.11.8 Suppose $U$ is an orthogonal $n \times n$ matrix. Explain why $\mathrm{rank}(U) = n$.


Exercise  4.11.9  Fill  in  the  missing  entries  to  make  the  matrix  orthogonal. 

xl  xl  A. 

\/2  \/6  -\/3 

J_ 

s/2  ~  ~ 

V6 

-  3 


Exercise  4.11.10  Fill  in  the  missing  entries  to  make  the  matrix  orthogonal. 

2  VI  1/9 

3  2  6 vz" 

2 

3  -  - 

0 


Exercise  4.11.11  Fill  in  the  missing  entries  to  make  the  matrix  orthogonal. 

r  1 

3 

2 
3 


Exercise 4.11.12 Find an orthonormal basis for the span of each of the following sets of vectors.

Exercise 4.11.13 Using the Gram Schmidt process find an orthonormal basis for the following span:


span 


Exercise  4.11.14  Using  the  Gram  Schmidt  process  find  an  orthonormal  basis  for  the  following  span: 


Exercise 4.11.15 The set
\[ V = \left\{ \begin{bmatrix} x \\ y \\ z \end{bmatrix} : 2x + 3y - z = 0 \right\} \]
is a subspace of $\mathbb{R}^3$. Find an orthonormal basis for this subspace.


Exercise 4.11.16 Consider the following scalar equation of a plane.

\[ 2x - 3y + z = 0 \]

Find the orthogonal complement of the vector $\vec{v} = \begin{bmatrix} 3 \\ 4 \\ 1 \end{bmatrix}$. Also find the point on the plane which is closest
to $(3, 4, 1)$.


Exercise 4.11.17 Consider the following scalar equation of a plane.

\[ x + 3y + z = 0 \]

Find the orthogonal complement of the vector $\vec{v} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$. Also find the point on the plane which is closest
to $(3, 4, 1)$.


Exercise 4.11.18 Let $\vec{v}$ be a vector and let $\vec{n}$ be a normal vector for a plane through the origin. Find the
equation of the line through the point determined by $\vec{v}$ which has direction vector $\vec{n}$. Show that it intersects
the plane at the point determined by $\vec{v} - \mathrm{proj}_{\vec{n}} \vec{v}$. Hint: The line: $\vec{v} + t\vec{n}$. It is in the plane if $\vec{n} \cdot (\vec{v} + t\vec{n}) = 0$.
Determine $t$. Then substitute in to the equation of the line.


Exercise 4.11.19 As shown in the above problem, one can find the closest point to $\vec{v}$ in a plane through the
origin by finding the intersection of the line through $\vec{v}$ having direction vector equal to the normal vector
to the plane with the plane. If the plane does not pass through the origin, this will still work to find the
point on the plane closest to the point determined by $\vec{v}$. Here is a relation which defines a plane

\[ 2x + y + z = 11 \]

and here is a point: $(1, 1, 2)$. Find the point on the plane which is closest to this point. Then determine
the distance from the point to the plane by taking the distance between these two points. Hint: Line:
$(x, y, z) = (1, 1, 2) + t(2, 1, 1)$. Now require that it intersect the plane.

Exercise 4.11.20 In general, you have a point $(x_0, y_0, z_0)$ and a scalar equation for a plane $ax + by + cz = d$
where $a^2 + b^2 + c^2 > 0$. Determine a formula for the closest point on the plane to the given point. Then
use this point to get a formula for the distance from the given point to the plane. Hint: Find the line
perpendicular to the plane which goes through the given point: $(x, y, z) = (x_0, y_0, z_0) + t(a, b, c)$. Now
require that this point satisfy the equation for the plane to determine $t$.

Exercise 4.11.21 Find the least squares solution to the following system.

\[ \begin{aligned} x + 2y &= 1 \\ 2x + 3y &= 2 \\ 3x + 5y &= 4 \end{aligned} \]


Exercise 4.11.22 You are doing experiments and have obtained the ordered pairs,

\[ (0, 1), \ (1, 2), \ (2, 3.5), \ (3, 4) \]

Find $m$ and $b$ such that $y = mx + b$ approximates these four points as well as possible.

Exercise 4.11.23 Suppose you have several ordered triples, $(x_i, y_i, z_i)$. Describe how to find a polynomial
such as

\[ z = a + bx + cy + dxy + ex^2 + fy^2 \]

giving the best fit to the given ordered triples.

4.12 Applications


Outcomes


A. Apply the concepts of vectors in $\mathbb{R}^n$ to the applications of physics and work.


4.12.1.  Vectors  and  Physics 


Suppose  you  push  on  something.  Then,  your  push  is  made  up  of  two  components,  how  hard  you  push  and 
the  direction  you  push.  This  illustrates  the  concept  of  force. 


Definition  4.156:  Force 


Force  is  a  vector.  The  magnitude  of  this  vector  is  a  measure  of  how  hard  it  is  pushing.  It  is  measured 
in  units  such  as  Newtons  or  pounds  or  tons.  The  direction  of  this  vector  is  the  direction  in  which 
the  push  is  taking  place. 


Vectors  are  used  to  model  force  and  other  physical  vectors  like  velocity.  As  with  all  vectors,  a  vector 
modeling  force  has  two  essential  ingredients,  its  magnitude  and  its  direction. 

Recall  the  special  vectors  which  point  along  the  coordinate  axes.  These  are  given  by 

\[ \vec{e}_i = [0 \ \cdots \ 0 \ 1 \ 0 \ \cdots \ 0]^T \]

where the 1 is in the $i$th slot and there are zeros in all the other spaces. The direction of $\vec{e}_i$ is referred to as
the $i$th direction.

Consider the following picture which illustrates the case of $\mathbb{R}^3$. Recall that in $\mathbb{R}^3$, we may refer to these
vectors as $\vec{i}$, $\vec{j}$, and $\vec{k}$.


Given a vector $\vec{u} = [u_1 \ \cdots \ u_n]^T$, it follows that

\[ \vec{u} = u_1 \vec{e}_1 + \cdots + u_n \vec{e}_n = \sum_{k=1}^n u_k \vec{e}_k \]

What does addition of vectors mean physically? Suppose two forces are applied to some object. Each
of these would be represented by a force vector and the two forces acting together would yield an overall
force acting on the object which would also be a force vector known as the resultant. Suppose the two
vectors are $\vec{u} = \sum_{k=1}^n u_k \vec{e}_k$ and $\vec{v} = \sum_{k=1}^n v_k \vec{e}_k$. Then the vector $\vec{u}$ involves a component in the $i$th direction
given by $u_i \vec{e}_i$, while the component in the $i$th direction of $\vec{v}$ is $v_i \vec{e}_i$. Then the vector $\vec{u} + \vec{v}$ should have a
component in the $i$th direction equal to $(u_i + v_i) \vec{e}_i$. This is exactly what is obtained when the vectors $\vec{u}$ and
$\vec{v}$ are added.

\[ \vec{u} + \vec{v} = [u_1 + v_1 \ \cdots \ u_n + v_n]^T = \sum_{i=1}^n (u_i + v_i) \vec{e}_i \]

Thus the addition of vectors according to the rules of addition in $\mathbb{R}^n$ which were presented earlier
yields the appropriate vector which duplicates the cumulative effect of all the vectors in the sum.

Consider  now  some  examples  of  vector  addition. 


Example  4.157:  The  Resultant  of  Three  Forces 


There are three ropes attached to a car and three people pull on these ropes. The first exerts a force
of $\vec{F}_1 = 2\vec{i} + 3\vec{j} - 2\vec{k}$ Newtons, the second exerts a force of $\vec{F}_2 = 3\vec{i} + 5\vec{j} + \vec{k}$ Newtons and the third
exerts a force of $5\vec{i} - \vec{j} + 2\vec{k}$ Newtons. Find the total force in the direction of $\vec{i}$.


Solution. To find the total force, we add the vectors as described above. This is given by

\[ \begin{aligned} (2\vec{i} + 3\vec{j} - 2\vec{k}) + (3\vec{i} + 5\vec{j} + \vec{k}) + (5\vec{i} - \vec{j} + 2\vec{k}) &= (2 + 3 + 5)\vec{i} + (3 + 5 + (-1))\vec{j} + (-2 + 1 + 2)\vec{k} \\ &= 10\vec{i} + 7\vec{j} + \vec{k} \end{aligned} \]

Hence, the total force is $10\vec{i} + 7\vec{j} + \vec{k}$ Newtons. Therefore, the force in the $\vec{i}$ direction is 10 Newtons.
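
The componentwise addition used in this solution can be sketched in a few lines. The function name `resultant` and the tuple representation of the forces are assumptions made for illustration.

```python
def resultant(*forces):
    """Add force vectors componentwise and return the resultant."""
    return tuple(sum(components) for components in zip(*forces))

# The three forces from the example, as (i, j, k) components in Newtons.
F1 = (2, 3, -2)
F2 = (3, 5, 1)
F3 = (5, -1, 2)

total = resultant(F1, F2, F3)
print(total)      # (10, 7, 1)
print(total[0])   # component in the i direction: 10
```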
Consider  another  example. 


Example  4.158:  Finding  a  Vector  from  Geometric  Description 


An airplane flies North East at 100 miles per hour. Write this as a vector.


Solution.  A  picture  of  this  situation  follows. 


Therefore, we need to find the vector $\vec{u}$ which has length 100 and direction as shown in this diagram.
We can consider the vector $\vec{u}$ as the hypotenuse of a right triangle having equal sides, since the direction
of $\vec{u}$ corresponds with the 45° line. The sides, corresponding to the $\vec{i}$ and $\vec{j}$ directions, should be each of
length $100/\sqrt{2}$. Therefore, the vector is given by

\[ \vec{u} = \frac{100}{\sqrt{2}} \vec{i} + \frac{100}{\sqrt{2}} \vec{j} = \begin{bmatrix} \frac{100}{\sqrt{2}} & \frac{100}{\sqrt{2}} \end{bmatrix}^T \]

*


This  example  also  motivates  the  concept  of  velocity,  defined  below. 


Definition  4.159:  Speed  and  Velocity 


The  speed  of  an  object  is  a  measure  of  how  fast  it  is  going.  It  is  measured  in  units  of  length  per  unit 
time.  For  example,  miles  per  hour,  kilometers  per  minute,  feet  per  second.  The  velocity  is  a  vector 
having  the  speed  as  the  magnitude  but  also  specifying  the  direction. 


Thus the velocity vector in the above example is $\frac{100}{\sqrt{2}} \vec{i} + \frac{100}{\sqrt{2}} \vec{j}$, while the speed is 100 miles per hour.

Consider  the  following  example. 


Example  4.160:  Position  From  Velocity  and  Time 


The velocity of an airplane is $100\vec{i} + \vec{j} + \vec{k}$ measured in kilometers per hour and at a certain instant
of time its position is $(1, 2, 1)$.

Find  the  position  of  this  airplane  one  minute  later. 


Solution.  Here  imagine  a  Cartesian  coordinate  system  in  which  the  third  component  is  altitude  and  the 
first  and  second  components  are  measured  on  a  line  from  West  to  East  and  a  line  from  South  to  North. 

Consider the vector $[1 \ 2 \ 1]^T$, which is the initial position vector of the airplane. As the plane
moves, the position vector changes according to the velocity vector. After one minute (considered as $\frac{1}{60}$ of
an hour) the airplane has moved in the $\vec{i}$ direction a distance of $100 \times \frac{1}{60} = \frac{5}{3}$ kilometer. In the $\vec{j}$ direction it
has moved $\frac{1}{60}$ kilometer during this same time, while it moves $\frac{1}{60}$ kilometer in the $\vec{k}$ direction. Therefore,
the new position vector for the airplane is

\[ \begin{bmatrix} 1 & 2 & 1 \end{bmatrix}^T + \begin{bmatrix} \frac{5}{3} & \frac{1}{60} & \frac{1}{60} \end{bmatrix}^T = \begin{bmatrix} \frac{8}{3} & \frac{121}{60} & \frac{61}{60} \end{bmatrix}^T \]
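
As a check on the arithmetic, the update "new position = old position + velocity × elapsed time" can be sketched with exact fractions. The function name `position_after` is hypothetical, and the velocity is assumed constant over the interval.

```python
from fractions import Fraction

def position_after(position, velocity, hours):
    """Return position + velocity * hours, computed componentwise."""
    return tuple(p + v * hours for p, v in zip(position, velocity))

start = (1, 2, 1)              # initial position, in kilometers
velocity = (100, 1, 1)         # kilometers per hour
one_minute = Fraction(1, 60)   # one minute expressed in hours

new_position = position_after(start, velocity, one_minute)
print(new_position)
```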

* 


Now  consider  an  example  which  involves  combining  two  velocities. 


Example  4.161:  Sum  of  Two  Velocities 


A certain river is one half kilometer wide with a current flowing at 4 kilometers per hour from East
to West. A man swims directly toward the opposite shore from the South bank of the river at a speed
of 3 kilometers per hour. How far down the river does he find himself when he has swum across?
How far does he end up swimming?


Solution. Consider the following picture which demonstrates the above scenario.

[Figure: the swimmer's velocity of 3 kilometers per hour across the river and the current of 4 kilometers per hour downstream.]



First we want to know the total time of the swim across the river. The velocity in the direction across
the river is 3 kilometers per hour, and the river is $\frac{1}{2}$ kilometer wide. It follows the trip takes 1/6 hour or
10 minutes.

Now, we can compute how far downstream he will end up. Since the river runs at a rate of 4 kilometers
per hour, and the trip takes 1/6 hour, the distance traveled downstream is given by $4 \left( \frac{1}{6} \right) = \frac{2}{3}$ kilometers.

The distance traveled by the swimmer is given by the hypotenuse of a right triangle. The two arms of
the triangle are given by the distance across the river, $\frac{1}{2}$ km, and the distance traveled downstream, $\frac{2}{3}$ km.
Then, using the Pythagorean Theorem, we can calculate the total distance $d$ traveled.

\[ d = \sqrt{\left( \frac{2}{3} \right)^2 + \left( \frac{1}{2} \right)^2} = \frac{5}{6} \text{ km} \]

Therefore, the swimmer travels a total distance of $\frac{5}{6}$ kilometers.
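
The three steps of this solution (crossing time, downstream drift, total distance) can be traced in a short sketch. The variable names are illustrative assumptions; the numbers are the ones given in the example.

```python
from fractions import Fraction
from math import hypot

width = Fraction(1, 2)   # river width in kilometers
swim_speed = 3           # kilometers per hour, straight across
current_speed = 4        # kilometers per hour, downstream

crossing_time = width / swim_speed          # hours needed to cross
downstream = current_speed * crossing_time  # kilometers carried downstream
# Hypotenuse of the right triangle with legs `width` and `downstream`.
distance = hypot(float(width), float(downstream))

print(crossing_time, downstream, distance)
```

The crossing time depends only on the velocity component across the river, which is why the current does not enter the first step.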


* 


4.12.2.  Work 


The mathematical concept of work is an application of vectors in $\mathbb{R}^n$. The physical concept of work differs
from  the  notion  of  work  employed  in  ordinary  conversation.  For  example,  suppose  you  were  to  slide  a 
150  pound  weight  off  a  table  which  is  three  feet  high  and  shuffle  along  the  floor  for  50  yards,  keeping  the 
height  always  three  feet  and  then  deposit  this  weight  on  another  three  foot  high  table.  The  physical  concept 
of  work  would  indicate  that  the  force  exerted  by  your  arms  did  no  work  during  this  project.  The  reason 
for  this  definition  is  that  even  though  your  arms  exerted  considerable  force  on  the  weight,  the  direction  of 
motion  was  at  right  angles  to  the  force  they  exerted.  The  only  part  of  a  force  which  does  work  in  the  sense 
of  physics  is  the  component  of  the  force  in  the  direction  of  motion. 

Work  is  defined  to  be  the  magnitude  of  the  component  of  this  force  times  the  distance  over  which  it 
acts,  when  the  component  of  force  points  in  the  direction  of  motion.  In  the  case  where  the  force  points 
in  exactly  the  opposite  direction  of  motion  work  is  given  by  (  —  1)  times  the  magnitude  of  this  component 
times  the  distance.  Thus  the  work  done  by  a  force  on  an  object  as  the  object  moves  from  one  point  to 
another  is  a  measure  of  the  extent  to  which  the  force  contributes  to  the  motion.  This  is  illustrated  in  the 
following  picture  in  the  case  where  the  given  force  contributes  to  the  motion. 


Recall that for any vector $\vec{u}$ in $\mathbb{R}^n$, we can write $\vec{u}$ as a sum of two vectors, as in

\[ \vec{u} = \vec{u}_{\parallel} + \vec{u}_{\perp} \]

For any force $\vec{F}$, we can write this force as the sum of a vector in the direction of the motion and a vector
perpendicular to the motion. In other words,

\[ \vec{F} = \vec{F}_{\parallel} + \vec{F}_{\perp} \]

In the above picture the force $\vec{F}$ is applied to an object which moves on the straight line from $P$ to $Q$.
There are two vectors shown, $\vec{F}_{\parallel}$ and $\vec{F}_{\perp}$, and the picture is intended to indicate that when you add these
two vectors you get $\vec{F}$. In other words, $\vec{F} = \vec{F}_{\parallel} + \vec{F}_{\perp}$. Notice that $\vec{F}_{\parallel}$ acts in the direction of motion and $\vec{F}_{\perp}$
acts perpendicular to the direction of motion. Only $\vec{F}_{\parallel}$ contributes to the work done by $\vec{F}$ on the object as it
moves from $P$ to $Q$. $\vec{F}_{\parallel}$ is called the component of the force in the direction of motion. From trigonometry,
you see the magnitude of $\vec{F}_{\parallel}$ should equal $\|\vec{F}\| \, |\cos \theta|$. Thus, since $\vec{F}_{\parallel}$ points in the direction of the vector
from $P$ to $Q$, the total work done should equal

\[ \|\vec{F}\| \, \|\overrightarrow{PQ}\| \cos \theta = \|\vec{F}\| \, \|\vec{q} - \vec{p}\| \cos \theta \]

Now, suppose the included angle had been obtuse. Then the work done by the force $\vec{F}$ on the object
would have been negative because $\vec{F}_{\parallel}$ would point in $-1$ times the direction of the motion. In this case,
$\cos \theta$ would also be negative and so it is still the case that the work done would be given by the above
formula. Thus from the geometric description of the dot product given above, the work equals

\[ \|\vec{F}\| \, \|\vec{q} - \vec{p}\| \cos \theta = \vec{F} \cdot (\vec{q} - \vec{p}) \]


This  explains  the  following  definition. 


Definition  4.162:  Work  Done  on  an  Object  by  a  Force 


Let $\vec{F}$ be a force acting on an object which moves from the point $P$ to the point $Q$, which have
position vectors given by $\vec{p}$ and $\vec{q}$ respectively. Then the work done on the object by the given force
equals $\vec{F} \cdot (\vec{q} - \vec{p})$.


Consider  the  following  example. 


Example  4.163:  Finding  Work 


Let $\vec{F} = [2 \ 7 \ -3]^T$ Newtons. Find the work done by this force in moving from the point
$(1, 2, 3)$ to the point $(-9, -3, 4)$ along the straight line segment joining these points where distances
are measured in meters.


Solution. First, compute the vector $\vec{q} - \vec{p}$, given by

\[ [-9 \ -3 \ 4]^T - [1 \ 2 \ 3]^T = [-10 \ -5 \ 1]^T \]

According to Definition 4.162 the work done is

\[ [2 \ 7 \ -3]^T \cdot [-10 \ -5 \ 1]^T = -20 + (-35) + (-3) = -58 \text{ Newton meters} \]

* 

Note  that  if  the  force  had  been  given  in  pounds  and  the  distance  had  been  given  in  feet,  the  units  on 
the  work  would  have  been  foot  pounds.  In  general,  work  has  units  equal  to  units  of  a  force  times  units  of 
a  length.  Recall  that  1  Newton  meter  is  equal  to  1  Joule.  Also  notice  that  the  work  done  by  the  force  can 
be  negative  as  in  the  above  example. 
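
Definition 4.162 translates directly into a dot product with the displacement vector. The sketch below uses hypothetical helper names `dot` and `work`, and represents points and forces as plain tuples.

```python
def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def work(force, p, q):
    """Work done by a constant force on an object moving from p to q."""
    displacement = tuple(qi - pi for pi, qi in zip(p, q))  # q - p
    return dot(force, displacement)

F = (2, 7, -3)      # Newtons
P = (1, 2, 3)       # meters
Q = (-9, -3, 4)     # meters
print(work(F, P, Q))   # -58 (Newton meters)
```

The negative result reflects a force whose component along the displacement points against the motion.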


Exercises 


Exercise  4.12.1  The  wind  blows  from  the  South  at  20  kilometers  per  hour  and  an  airplane  which  flies  at 
600  kilometers  per  hour  in  still  air  is  heading  East.  Find  the  velocity  of  the  airplane  and  its  location  after 
two  hours. 

Exercise  4.12.2  The  wind  blows  from  the  West  at  30  kilometers  per  hour  and  an  airplane  which  flies  at 
400  kilometers  per  hour  in  still  air  is  heading  North  East.  Find  the  velocity  of  the  airplane  and  its  position 
after  two  hours. 

Exercise 4.12.3 The wind blows from the North at 10 kilometers per hour. An airplane which flies at 300
kilometers per hour in still air is supposed to go to the point whose coordinates are at $(100, 100)$. In what
direction should the airplane fly?


Exercise 4.12.4 Three forces act on an object. Two are $\begin{bmatrix} 3 \\ -1 \\ -1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -3 \\ 4 \end{bmatrix}$ Newtons. Find the third
force if the object is not to move.


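Exercise 4.12.5 Three forces act on an object. Two are $\begin{bmatrix} 6 \\ -3 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}$ Newtons. Find the third force
if the total force on the object is to be $\begin{bmatrix} 7 \\ 1 \\ 3 \end{bmatrix}$.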


Exercise 4.12.6 A river flows West at the rate of $b$ miles per hour. A boat can move at the rate of 8 miles
per hour. Find the smallest value of $b$ such that it is not possible for the boat to proceed directly across the
river.


Exercise 4.12.7 The wind blows from West to East at a speed of 50 miles per hour and an airplane which
travels at 400 miles per hour in still air is heading North West. What is the velocity of the airplane relative
to the ground? What is the component of this velocity in the direction North?

Exercise 4.12.8 The wind blows from West to East at a speed of 60 miles per hour and an airplane
travels at 100 miles per hour in still air. How many degrees West of North should the airplane head
in order to travel exactly North?

Exercise 4.12.9 The wind blows from West to East at a speed of 50 miles per hour and an airplane which
travels at 400 miles per hour in still air is heading somewhat West of North so that, with the wind, it is flying
due North. It uses 30.0 gallons of gas every hour. If it has to travel 600.0 miles due North, how much gas
will it use in flying to its destination?

Exercise  4.12.10  An  airplane  is  flying  due  north  at  150.0  miles  per  hour  but  it  is  not  actually  going  due 
North  because  there  is  a  wind  which  is  pushing  the  airplane  due  east  at  40.0  miles  per  hour.  After  one 
hour,  the  plane  starts  flying  30°  East  of  North.  Assuming  the  plane  starts  at  (0,0),  where  is  it  after  2 
hours?  Let  North  be  the  direction  of  the  positive  y  axis  and  let  East  be  the  direction  of  the  positive  x  axis. 

Exercise  4.12.11  City  A  is  located  at  the  origin  (0, 0)  while  city  B  is  located  at  (300, 500)  where  distances 
are  in  miles.  An  airplane  flies  at  250  miles  per  hour  in  still  air.  This  airplane  wants  to  fly  from  city  A  to 
city  B  but  the  wind  is  blowing  in  the  direction  of  the  positive  y  axis  at  a  speed  of  50  miles  per  hour.  Find  a 
unit  vector  such  that  if  the  plane  heads  in  this  direction,  it  will  end  up  at  city  B  having  flown  the  shortest 
possible  distance.  How  long  will  it  take  to  get  there? 

Exercise  4.12.12  A  certain  river  is  one  half  mile  wide  with  a  current  flowing  at  2  miles  per  hour  from 
East  to  West.  A  man  swims  directly  toward  the  opposite  shore  from  the  South  bank  of  the  river  at  a  speed 
of  3  miles  per  hour.  How  far  down  the  river  does  he  find  himself  when  he  has  swam  across?  How  far  does 
he  end  up  traveling? 

Exercise  4.12.13  A  certain  river  is  one  half  mile  wide  with  a  current  flowing  at  2  miles  per  hour  from 
East  to  West.  A  man  can  swim  at  3  miles  per  hour  in  still  water.  In  what  direction  should  he  swim  in  order 
to  travel  directly  across  the  river?  What  would  the  answer  to  this  problem  be  if  the  river  flowed  at  3  miles 
per  hour  and  the  man  could  swim  only  at  the  rate  of  2  miles  per  hour? 

Exercise 4.12.14 Three forces are applied to a point which does not move. Two of the forces are $2\vec{i} + 2\vec{j} - 6\vec{k}$
Newtons and $8\vec{i} + 8\vec{j} + 3\vec{k}$ Newtons. Find the third force.

Exercise 4.12.15 The total force acting on an object is to be $4\vec{i} + 2\vec{j} - 3\vec{k}$ Newtons. A force of $-3\vec{i} - \vec{j} + 8\vec{k}$
Newtons is being applied. What other force should be applied to achieve the desired total force?

Exercise  4.12.16  A  bird  flies  from  its  nest  8  km  in  the  direction  g7T  north  of  east  where  it  stops  to  rest 
on  a  tree.  It  then  flies  1  km  in  the  direction  due  southeast  and  lands  atop  a  telephone  pole.  Place  an  xy 
coordinate  system  so  that  the  origin  is  the  bird’s  nest,  and  the  positive  x  axis  points  east  and  the  positive  y 
axis  points  north.  Find  the  displacement  vector  from  the  nest  to  the  telephone  pole. 

Exercise 4.12.17 If $\vec{F}$ is a force and $\vec{D}$ is a vector, show $\mathrm{proj}_{\vec{D}} (\vec{F}) = \left( \|\vec{F}\| \cos \theta \right) \vec{u}$ where $\vec{u}$ is the unit
vector in the direction of $\vec{D}$, where $\vec{u} = \vec{D} / \|\vec{D}\|$ and $\theta$ is the included angle between the two vectors $\vec{F}$ and
$\vec{D}$. $\|\vec{F}\| \cos \theta$ is sometimes called the component of the force $\vec{F}$ in the direction $\vec{D}$.

Exercise 4.12.18 A boy drags a sled for 100 feet along the ground by pulling on a rope which is 20 degrees
from the horizontal with a force of 40 pounds. How much work does this force do?

Exercise 4.12.19 A girl drags a sled for 200 feet along the ground by pulling on a rope which is 30 degrees
from the horizontal with a force of 20 pounds. How much work does this force do?

Exercise 4.12.20 A large dog drags a sled for 300 feet along the ground by pulling on a rope which is 45
degrees from the horizontal with a force of 20 pounds. How much work does this force do?

Exercise  4.12.21  How  much  work  does  it  take  to  slide  a  crate  20  meters  along  a  loading  dock  by  pulling 
on  it  with  a  200  Newton  force  at  an  angle  of  30°  from  the  horizontal?  Express  your  answer  in  Newton 
meters. 

Exercise 4.12.22 An object moves 10 meters in the direction of $\vec{j}$. There are two forces acting on this
object, $\vec{F}_1 = \vec{i} + \vec{j} + 2\vec{k}$, and $\vec{F}_2 = -5\vec{i} + 2\vec{j} - 6\vec{k}$. Find the total work done on the object by the two forces.
Hint: You can take the work done by the resultant of the two forces or you can add the work done by each
force. Why?

Exercise 4.12.23 An object moves 10 meters in the direction of $\vec{j} + \vec{i}$. There are two forces acting on this
object, $\vec{F}_1 = \vec{i} + 2\vec{j} + 2\vec{k}$, and $\vec{F}_2 = 5\vec{i} + 2\vec{j} - 6\vec{k}$. Find the total work done on the object by the two forces.
Hint: You can take the work done by the resultant of the two forces or you can add the work done by each
force. Why?

Exercise 4.12.24 An object moves 20 meters in the direction of $\vec{k} + \vec{j}$. There are two forces acting on this
object, $\vec{F}_1 = \vec{i} + \vec{j} + 2\vec{k}$, and $\vec{F}_2 = \vec{i} + 2\vec{j} - 6\vec{k}$. Find the total work done on the object by the two forces.
Hint: You can take the work done by the resultant of the two forces or you can add the work done by each
force.

5.1 Linear Transformations


Outcomes 


A.  Understand  the  definition  of  a  linear  transformation,  and  that  all  linear  transformations  are 
determined  by  matrix  multiplication. 


Recall  that  when  we  multiply  an  m  x  n  matrix  by  an  n  x  1  column  vector,  the  result  is  an  m  x  1  column 
vector.  In  this  section  we  will  discuss  how,  through  matrix  multiplication,  an  m  x  n  matrix  transforms  an 
n  x  1  column  vector  into  an  m  x  1  column  vector. 


Recall that the $n \times 1$ vector given by

\[ \vec{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \]

is said to belong to $\mathbb{R}^n$, which is the set of all $n \times 1$ vectors. In this section, we will discuss transformations
of vectors in $\mathbb{R}^n$.


Consider  the  following  example. 


Example  5.1:  A  Function  Which  Transforms  Vectors 

Consider the matrix $A = \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \end{bmatrix}$. Show that by matrix multiplication $A$ transforms vectors in
$\mathbb{R}^3$ into vectors in $\mathbb{R}^2$.

Solution. First, recall that vectors in $\mathbb{R}^3$ are vectors of size $3 \times 1$, while vectors in $\mathbb{R}^2$ are of size $2 \times 1$. If
we multiply $A$, which is a $2 \times 3$ matrix, by a $3 \times 1$ vector, the result will be a $2 \times 1$ vector. This is what we
mean when we say that $A$ transforms vectors.


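Now, for $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$ in $\mathbb{R}^3$, multiply on the left by the given matrix to obtain the new vector. This product
looks like

\[ \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y \\ 2x + y \end{bmatrix} \]

The resulting product is a $2 \times 1$ vector which is determined by the choice of $x$ and $y$. Here are some
numerical examples.

\[ \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 5 \\ 4 \end{bmatrix} \]

Here, the vector $\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$ in $\mathbb{R}^3$ was transformed by the matrix into the vector $\begin{bmatrix} 5 \\ 4 \end{bmatrix}$ in $\mathbb{R}^2$.

Here is another example:

\[ \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \end{bmatrix} \begin{bmatrix} 10 \\ 5 \\ -3 \end{bmatrix} = \begin{bmatrix} 20 \\ 25 \end{bmatrix} \]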


* 


The idea is to define a function which takes vectors in $\mathbb{R}^3$ and delivers new vectors in $\mathbb{R}^2$. In this case,
that function is multiplication by the matrix $A$.

Let $T$ denote such a function. The notation $T : \mathbb{R}^n \mapsto \mathbb{R}^m$ means that the function $T$ transforms vectors
in $\mathbb{R}^n$ into vectors in $\mathbb{R}^m$. The notation $T(\vec{x})$ means the transformation $T$ applied to the vector $\vec{x}$. The above
example demonstrated a transformation achieved by matrix multiplication. In this case, we often write

\[ T_A(\vec{x}) = A\vec{x} \]

Therefore, $T_A$ is the transformation determined by the matrix $A$. In this case we say that $T$ is a matrix
transformation.
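
A matrix transformation $T_A$ can be sketched as an ordinary function that multiplies a fixed matrix by its argument. The helper name `mat_vec` and the list-of-rows representation are assumptions made for this illustration; the matrix is the one from Example 5.1.

```python
def mat_vec(A, x):
    """Return the product A x, where A is a list of rows and x a list."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A must have as many entries as x")
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 2, 0],
     [2, 1, 0]]   # a 2 x 3 matrix, so T_A maps R^3 into R^2

def T_A(x):
    """The matrix transformation determined by A."""
    return mat_vec(A, x)

print(T_A([1, 2, 3]))     # [5, 4]
print(T_A([10, 5, -3]))   # [20, 25]
```

The input has three entries and the output has two, which mirrors the statement that a $2 \times 3$ matrix transforms vectors in $\mathbb{R}^3$ into vectors in $\mathbb{R}^2$.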

Recall  the  property  of  matrix  multiplication  that  states  that  for  k  and  p  scalars, 

A  (kB  +  pC )  =  kAB  +  pAC 

In particular, for $A$ an $m \times n$ matrix and $B$ and $C$, $n \times 1$ vectors in $\mathbb{R}^n$, this formula holds.

In  other  words,  this  means  that  matrix  multiplication  gives  an  example  of  a  linear  transformation, 
which  we  will  now  define. 


Definition  5.2:  Linear  Transformation 


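Let $T : \mathbb{R}^n \mapsto \mathbb{R}^m$ be a function, where for each $\vec{x} \in \mathbb{R}^n$, $T(\vec{x}) \in \mathbb{R}^m$. Then $T$ is a linear transformation
if whenever $k, p$ are scalars and $\vec{x}_1$ and $\vec{x}_2$ are vectors in $\mathbb{R}^n$ ($n \times 1$ vectors),

\[ T(k\vec{x}_1 + p\vec{x}_2) = kT(\vec{x}_1) + pT(\vec{x}_2) \]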


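Consider the following example.

Solution. By Definition 5.2 we need to show that $T(k\vec{x}_1 + p\vec{x}_2) = kT(\vec{x}_1) + pT(\vec{x}_2)$ for all scalars $k, p$
and vectors $\vec{x}_1, \vec{x}_2$. Let

\[ \vec{x}_1 = \begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix}, \quad \vec{x}_2 = \begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} \]

Then, using the rule $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + y \\ x - z \end{bmatrix}$ that the computation applies,

\[ \begin{aligned} T(k\vec{x}_1 + p\vec{x}_2) &= T\left( k \begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix} + p \begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} \right) \\ &= T\left( \begin{bmatrix} kx_1 \\ ky_1 \\ kz_1 \end{bmatrix} + \begin{bmatrix} px_2 \\ py_2 \\ pz_2 \end{bmatrix} \right) \\ &= T \begin{bmatrix} kx_1 + px_2 \\ ky_1 + py_2 \\ kz_1 + pz_2 \end{bmatrix} \\ &= \begin{bmatrix} (kx_1 + px_2) + (ky_1 + py_2) \\ (kx_1 + px_2) - (kz_1 + pz_2) \end{bmatrix} \\ &= \begin{bmatrix} (kx_1 + ky_1) + (px_2 + py_2) \\ (kx_1 - kz_1) + (px_2 - pz_2) \end{bmatrix} \\ &= \begin{bmatrix} kx_1 + ky_1 \\ kx_1 - kz_1 \end{bmatrix} + \begin{bmatrix} px_2 + py_2 \\ px_2 - pz_2 \end{bmatrix} \\ &= k \begin{bmatrix} x_1 + y_1 \\ x_1 - z_1 \end{bmatrix} + p \begin{bmatrix} x_2 + y_2 \\ x_2 - z_2 \end{bmatrix} \\ &= kT(\vec{x}_1) + pT(\vec{x}_2) \end{aligned} \]

Therefore $T$ is a linear transformation.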
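
The identity $T(k\vec{x}_1 + p\vec{x}_2) = kT(\vec{x}_1) + pT(\vec{x}_2)$ can also be spot-checked numerically for the map applied in the computation above, which sends $(x, y, z)$ to $(x + y, x - z)$. The sketch below is only a sanity check on a handful of arbitrarily chosen scalars and vectors; it does not replace the proof, and the helper names `combine` and `respects_linearity` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def T(v):
    """The map (x, y, z) -> (x + y, x - z) used in the computation above."""
    x, y, z = v
    return (x + y, x - z)

def combine(k, u, p, v):
    """Return the linear combination k*u + p*v, componentwise."""
    return tuple(k * a + p * b for a, b in zip(u, v))

def respects_linearity(k, p, x1, x2):
    """Check T(k*x1 + p*x2) == k*T(x1) + p*T(x2) for one choice of inputs."""
    return T(combine(k, x1, p, x2)) == combine(k, T(x1), p, T(x2))

# Arbitrary test values; exact fractions keep the comparison free of rounding.
scalars = [Fraction(-3, 2), 0, 2]
vectors = [(1, 2, 3), (0, -1, 4), (Fraction(1, 2), 5, -2)]

checks = [respects_linearity(k, p, x1, x2)
          for k, p in product(scalars, repeat=2)
          for x1, x2 in product(vectors, repeat=2)]
print(all(checks))   # True
```

A single failing check would show that a map is not linear, whereas passing checks are only consistent with linearity; the algebraic argument is what establishes it for all inputs.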

Two important examples of linear transformations are the zero transformation and identity transformation. The zero transformation defined by $T(\vec{x}) = \vec{0}$ for all $\vec{x}$ is an example of a linear transformation. Similarly the identity transformation defined by $T(\vec{x}) = \vec{x}$ is also linear. Take the time to prove these using the method demonstrated in Example 5.3.
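A numerical spot check can complement (but never replace) such a proof. The sketch below assumes the map from the worked solution above, $T(x, y, z) = (x + y,\ x - z)$; the helper names `combo` and `looks_linear` are illustrative only. It tests the identity $T(k\vec{u} + p\vec{w}) = kT(\vec{u}) + pT(\vec{w})$ on random rational inputs, so a failure disproves linearity while a pass merely fails to find a counterexample.

```python
from fractions import Fraction
import random


def T(v):
    """Assumed example map T(x, y, z) = (x + y, x - z)."""
    x, y, z = v
    return (x + y, x - z)


def combo(k, u, p, w):
    """Return the linear combination k*u + p*w, computed componentwise."""
    return tuple(k * a + p * b for a, b in zip(u, w))


def looks_linear(f, dim, trials=100, seed=0):
    """Return False as soon as f(k*u + p*w) != k*f(u) + p*f(w) for a random sample."""
    rng = random.Random(seed)
    for _ in range(trials):
        k = Fraction(rng.randint(-9, 9))
        p = Fraction(rng.randint(-9, 9))
        u = tuple(Fraction(rng.randint(-9, 9)) for _ in range(dim))
        w = tuple(Fraction(rng.randint(-9, 9)) for _ in range(dim))
        if f(combo(k, u, p, w)) != combo(k, f(u), p, f(w)):
            return False
    return True


def zero_map(v):
    """The zero transformation: every vector is sent to the zero vector."""
    return tuple(0 for _ in v)


def identity_map(v):
    """The identity transformation: every vector is sent to itself."""
    return tuple(v)


def shift(v):
    """A translation of the first coordinate; this map is not linear."""
    return (v[0] + 1, v[1])
```

Under these assumptions, `looks_linear(T, 3)`, `looks_linear(zero_map, 3)` and `looks_linear(identity_map, 3)` all pass, whereas `looks_linear(shift, 2)` fails, since a translation does not preserve linear combinations.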

We began this section by discussing matrix transformations, where multiplication by a matrix transforms vectors. These matrix transformations are in fact linear transformations.

Theorem 5.4: Matrix Transformations are Linear Transformations

Let $T : \mathbb{R}^n \mapsto \mathbb{R}^m$ be a transformation defined by $T(\vec{x}) = A\vec{x}$. Then $T$ is a linear transformation.

It turns out that every linear transformation can be expressed as a matrix transformation, and thus linear transformations are exactly the same as matrix transformations.


Exercises 


Exercise 5.1.1 Show the map $T : \mathbb{R}^n \mapsto \mathbb{R}^m$ defined by $T(\vec{x}) = A\vec{x}$ where $A$ is an $m \times n$ matrix and $\vec{x}$ is an $n \times 1$ column vector is a linear transformation.

Exercise 5.1.2 Show that the function defined by $T_{\vec{u}}(\vec{v}) = \vec{v} - \mathrm{proj}_{\vec{u}}(\vec{v})$ is also a linear transformation.

Exercise 5.1.3 Let $\vec{u}$ be a fixed vector. The function $T_{\vec{u}}$ defined by $T_{\vec{u}}\vec{v} = \vec{u} + \vec{v}$ has the effect of translating all vectors by adding $\vec{u} \neq \vec{0}$. Show this is not a linear transformation. Explain why it is not possible to represent $T_{\vec{u}}$ in $\mathbb{R}^3$ by multiplying by a $3 \times 3$ matrix.


5.2  The  Matrix  of  a  Linear  Transformation  I 


Outcomes 


A. Find the matrix of a linear transformation with respect to the standard basis.

B. Determine the action of a linear transformation on a vector in $\mathbb{R}^n$.

In the above examples, the action of the linear transformations was to multiply by a matrix. It turns out that this is always the case for linear transformations. If $T$ is any linear transformation which maps $\mathbb{R}^n$ to $\mathbb{R}^m$, there is always an $m \times n$ matrix $A$ with the property that

\[ T(\vec{x}) = A\vec{x} \tag{5.1} \]

for all $\vec{x} \in \mathbb{R}^n$.

Theorem 5.5: Matrix of a Linear Transformation

Let $T : \mathbb{R}^n \mapsto \mathbb{R}^m$ be a linear transformation. Then we can find a matrix $A$ such that $T(\vec{x}) = A\vec{x}$. In this case, we say that $T$ is determined or induced by the matrix $A$.




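Here is why. Suppose $T : \mathbb{R}^n \mapsto \mathbb{R}^m$ is a linear transformation and you want to find the matrix defined by this linear transformation as described in 5.1. Note that

\[
\vec{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= x_1 \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
+ x_2 \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}
+ \cdots
+ x_n \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}
= \sum_{i=1}^{n} x_i \vec{e}_i
\]

where $\vec{e}_i$ is the $i$th column of $I_n$, that is the $n \times 1$ vector which has zeros in every slot but the $i$th and a 1 in this slot.

Then since $T$ is linear,

\[
T(\vec{x}) = \sum_{i=1}^{n} x_i T(\vec{e}_i)
= \begin{bmatrix} T(\vec{e}_1) & \cdots & T(\vec{e}_n) \end{bmatrix}
\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}
= A \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}
\]

The desired matrix is obtained from constructing the $i$th column as $T(\vec{e}_i)$. Recall that the set $\{\vec{e}_1, \vec{e}_2, \cdots, \vec{e}_n\}$ is called the standard basis of $\mathbb{R}^n$. Therefore the matrix of $T$ is found by applying $T$ to the standard basis. We state this formally as the following theorem.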


The  following  Corollary  is  an  essential  result. 


Corollary  5.7:  Matrix  and  Linear  Transformation 


A transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation if and only if it is a matrix transformation.
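The construction of the matrix from the images of the standard basis vectors can be sketched in a few lines. This is a minimal illustration rather than a definitive implementation: the helper names `standard_basis`, `matrix_of` and `apply` are made up for this sketch, and the sample map $T(x, y, z) = (x + y,\ x - z)$ is assumed purely as an example.

```python
def standard_basis(n):
    """Return e_1, ..., e_n as lists of length n."""
    return [[1 if i == j else 0 for i in range(n)] for j in range(n)]


def matrix_of(T, n):
    """Build the matrix whose i-th column is T(e_i); the result is a list of rows."""
    columns = [T(e) for e in standard_basis(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def apply(A, x):
    """Compute the matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def T(v):
    """Assumed sample linear map from R^3 to R^2."""
    x, y, z = v
    return [x + y, x - z]


A = matrix_of(T, 3)
# Multiplying by A should reproduce T on any input vector.
same = apply(A, [2, 3, 5]) == T([2, 3, 5])
```

For the assumed map, `A` has rows `[1, 1, 0]` and `[1, 0, -1]`, and `same` is `True`.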


Consider the following example.

Solution. By Theorem 5.6 we construct $A$ as follows:

\[ A = \begin{bmatrix} T(\vec{e}_1) & \cdots & T(\vec{e}_n) \end{bmatrix} \]

In this case, $A$ will be a $2 \times 3$ matrix, so we need to find $T(\vec{e}_1)$, $T(\vec{e}_2)$, and $T(\vec{e}_3)$. Luckily, we have been given these values so we can fill in $A$ as needed, using these vectors as the columns of $A$. Hence,

\[ A = \begin{bmatrix} 1 & 9 & 1 \\ 2 & -3 & 1 \end{bmatrix} \]

♠

In this example, we were given the resulting vectors of $T(\vec{e}_1)$, $T(\vec{e}_2)$, and $T(\vec{e}_3)$. Constructing the matrix $A$ was simple, as we could simply use these vectors as the columns of $A$. The next example shows how to find $A$ when we are not given the $T(\vec{e}_i)$ so clearly.


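Solution. By Theorem 5.6 to find this matrix, we need to determine the action of $T$ on $\vec{e}_1$ and $\vec{e}_2$. In the previous example, we were given these resulting vectors. However, in this example, we have been given $T$ of two different vectors. How can we find out the action of $T$ on $\vec{e}_1$ and $\vec{e}_2$? In particular for $\vec{e}_1$, suppose there exist $x$ and $y$ such that

\[ \begin{bmatrix} 1 \\ 0 \end{bmatrix} = x \begin{bmatrix} 1 \\ 1 \end{bmatrix} + y \begin{bmatrix} 0 \\ -1 \end{bmatrix} \tag{5.2} \]

Then, since $T$ is linear,

\[ T\begin{bmatrix} 1 \\ 0 \end{bmatrix} = xT\begin{bmatrix} 1 \\ 1 \end{bmatrix} + yT\begin{bmatrix} 0 \\ -1 \end{bmatrix} \]

Substituting in values, this sum becomes

\[ T\begin{bmatrix} 1 \\ 0 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 3 \\ 2 \end{bmatrix} \tag{5.3} \]

Therefore, if we know the values of $x$ and $y$ which satisfy 5.2, we can substitute these into equation 5.3. By doing so, we find $T(\vec{e}_1)$ which is the first column of the matrix $A$.

We proceed to find $x$ and $y$. We do so by solving 5.2, which can be done by solving the system

\[
\begin{aligned}
x &= 1 \\
x - y &= 0
\end{aligned}
\]

We see that $x = 1$ and $y = 1$ is the solution to this system. Substituting these values into equation 5.3, we have

\[ T\begin{bmatrix} 1 \\ 0 \end{bmatrix} = 1\begin{bmatrix} 1 \\ 2 \end{bmatrix} + 1\begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} + \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \end{bmatrix} \]

Therefore $\begin{bmatrix} 4 \\ 4 \end{bmatrix}$ is the first column of $A$.

Computing the second column is done in the same way, and is left as an exercise. The resulting matrix $A$ is given by

\[ A = \begin{bmatrix} 4 & -3 \\ 4 & -2 \end{bmatrix} \]

This example illustrates a very long procedure for finding the matrix of $T$. While this method is reliable and will always result in the correct matrix $A$, the following procedure provides an alternative method.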


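Procedure 5.10: Finding the Matrix of Inconveniently Defined Linear Transformation

Suppose $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation. Suppose there exist vectors $\{\vec{a}_1, \cdots, \vec{a}_n\}$ in $\mathbb{R}^n$ such that $\begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix}^{-1}$ exists, and

\[ T(\vec{a}_i) = \vec{b}_i \]

Then the matrix of $T$ must be of the form

\[ \begin{bmatrix} \vec{b}_1 & \cdots & \vec{b}_n \end{bmatrix} \begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix}^{-1} \]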

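We will illustrate this procedure in the following example. You may also find it useful to work through Example 5.9 using this procedure.

Example 5.11: Matrix of a Linear Transformation Given Inconveniently

Suppose $T : \mathbb{R}^3 \to \mathbb{R}^3$ is a linear transformation and

\[
T\begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad
T\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}, \quad
T\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\]

Find the matrix of this linear transformation.

Solution. By Procedure 5.10, $A = \begin{bmatrix} 1 & 0 & 1 \\ 3 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 2 & 0 \\ 1 & 1 & 0 \\ 1 & 3 & 1 \end{bmatrix}$.

Then, Procedure 5.10 claims that the matrix of $T$ is

\[ C = BA^{-1} = \begin{bmatrix} 2 & -2 & 4 \\ 0 & 0 & 1 \\ 4 & -3 & 6 \end{bmatrix} \]

Indeed you can first verify that $T(\vec{x}) = C\vec{x}$ for the 3 vectors above:

\[
\begin{bmatrix} 2 & -2 & 4 \\ 0 & 0 & 1 \\ 4 & -3 & 6 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad
\begin{bmatrix} 2 & -2 & 4 \\ 0 & 0 & 1 \\ 4 & -3 & 6 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}, \quad
\begin{bmatrix} 2 & -2 & 4 \\ 0 & 0 & 1 \\ 4 & -3 & 6 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\]

But more generally $T(\vec{x}) = C\vec{x}$ for any $\vec{x}$. To see this, let $\vec{y} = A^{-1}\vec{x}$ and then using linearity of $T$:

\[
T(\vec{x}) = T(A\vec{y}) = T\left( \sum_i y_i \vec{a}_i \right) = \sum_i y_i T(\vec{a}_i) = \sum_i y_i \vec{b}_i = B\vec{y} = BA^{-1}\vec{x} = C\vec{x}
\]

♠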
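The computation $C = BA^{-1}$ from Procedure 5.10 can be reproduced with exact rational arithmetic. The following is a sketch under the assumption that the columns of $A$ are the given input vectors and the columns of $B$ are their images, as in Example 5.11; the helpers `inverse` and `matmul` are written here for illustration and are not taken from any library.

```python
from fractions import Fraction


def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination over the rationals."""
    n = len(M)
    # Augment M with the identity matrix: [M | I].
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(X, Y):
    """Compute the matrix product X Y."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


A = [[1, 0, 1], [3, 1, 1], [1, 1, 0]]  # columns are the input vectors a_i
B = [[0, 2, 0], [1, 1, 0], [1, 3, 1]]  # columns are the images b_i = T(a_i)
C = matmul(B, inverse(A))              # matrix of T according to Procedure 5.10
```

Because $CA = B$ holds column by column, each given vector $\vec{a}_i$ is sent to $\vec{b}_i$, which is the check performed by hand in the example.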


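Recall the dot product discussed earlier. Consider the map $\vec{v} \mapsto \mathrm{proj}_{\vec{u}}(\vec{v})$ which takes a vector and transforms it to its projection onto a given vector $\vec{u}$. It turns out that this map is linear, a result which follows from the properties of the dot product. This is shown as follows.

\[
\begin{aligned}
\mathrm{proj}_{\vec{u}}(k\vec{v} + p\vec{w})
&= \left( \frac{(k\vec{v} + p\vec{w}) \cdot \vec{u}}{\vec{u} \cdot \vec{u}} \right) \vec{u} \\
&= k\left( \frac{\vec{v} \cdot \vec{u}}{\vec{u} \cdot \vec{u}} \right) \vec{u} + p\left( \frac{\vec{w} \cdot \vec{u}}{\vec{u} \cdot \vec{u}} \right) \vec{u} \\
&= k\,\mathrm{proj}_{\vec{u}}(\vec{v}) + p\,\mathrm{proj}_{\vec{u}}(\vec{w})
\end{aligned}
\]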


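Consider the following example.

Solution.

1. First, we have just seen that $T(\vec{v}) = \mathrm{proj}_{\vec{u}}(\vec{v})$ is linear. Therefore by Theorem 5.5, we can find a matrix $A$ such that $T(\vec{x}) = A\vec{x}$.

2. The columns of the matrix for $T$ are defined above as $T(\vec{e}_i)$. It follows that $T(\vec{e}_i) = \mathrm{proj}_{\vec{u}}(\vec{e}_i)$ gives the $i$th column of the desired matrix. Therefore, we need to find

\[ \mathrm{proj}_{\vec{u}}(\vec{e}_i) = \left( \frac{\vec{e}_i \cdot \vec{u}}{\vec{u} \cdot \vec{u}} \right) \vec{u} \]

For the given vector $\vec{u}$, this implies the columns of the desired matrix are

\[ \frac{1}{14}\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad \frac{2}{14}\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad \frac{3}{14}\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \]

which you can verify. Hence the matrix of $T$ is

\[ \frac{1}{14}\begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 3 & 6 & 9 \end{bmatrix} \]

♠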
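The column-by-column construction above amounts to the formula "entry $(i, j)$ equals $u_i u_j / (\vec{u} \cdot \vec{u})$". A minimal sketch, assuming $\vec{u} = (1, 2, 3)$ as in the solution and using an illustrative helper name `projection_matrix`, follows.

```python
from fractions import Fraction


def projection_matrix(u):
    """Matrix of v -> proj_u(v) for an integer vector u.

    Column j is (e_j . u / u . u) u, so entry (i, j) is u_i * u_j / (u . u).
    """
    uu = sum(c * c for c in u)
    if uu == 0:
        raise ValueError("cannot project onto the zero vector")
    return [[Fraction(ui * uj, uu) for uj in u] for ui in u]


def apply(A, x):
    """Compute the matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


P = projection_matrix([1, 2, 3])
# Projecting u onto itself should return u unchanged.
fixed = apply(P, [1, 2, 3])
```

Here `P` equals $\frac{1}{14}$ times the matrix with rows $(1, 2, 3)$, $(2, 4, 6)$, $(3, 6, 9)$, and `fixed` is the vector $(1, 2, 3)$ again, as expected for a projection onto $\vec{u}$.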


Exercises 


Exercise 5.2.1 Consider the following functions which map $\mathbb{R}^n$ to $\mathbb{R}^n$.

(a) $T$ multiplies the $j$th component of $\vec{x}$ by a nonzero number $b$.

(b) $T$ replaces the $i$th component of $\vec{x}$ with $b$ times the $j$th component added to the $i$th component.

(c) $T$ switches the $i$th and $j$th components.

Show these functions are linear transformations and describe their matrices $A$ such that $T(\vec{x}) = A\vec{x}$.

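Exercise 5.2.2 You are given a linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ and you know that

\[ T(A_i) = B_i \]

where $\begin{bmatrix} A_1 & \cdots & A_n \end{bmatrix}^{-1}$ exists. Show that the matrix of $T$ is of the form

\[ \begin{bmatrix} B_1 & \cdots & B_n \end{bmatrix} \begin{bmatrix} A_1 & \cdots & A_n \end{bmatrix}^{-1} \]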

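Exercise 5.2.3 Suppose $T$ is a linear transformation such that

\[
T\begin{bmatrix} 1 \\ 2 \\ -6 \end{bmatrix} = \begin{bmatrix} 5 \\ 1 \\ 3 \end{bmatrix}, \quad
T\begin{bmatrix} -1 \\ -1 \\ 5 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix}
\]

Find the matrix of $T$. That is find $A$ such that $T(\vec{x}) = A\vec{x}$.

Exercise 5.2.4 Suppose $T$ is a linear transformation such that

\[
T\begin{bmatrix} 1 \\ 1 \\ -8 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}, \quad
T\begin{bmatrix} -1 \\ 0 \\ 6 \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \\ 1 \end{bmatrix}
\]

Find the matrix of $T$. That is find $A$ such that $T(\vec{x}) = A\vec{x}$.

Exercise 5.2.5 Suppose $T$ is a linear transformation such that

\[
T\begin{bmatrix} 1 \\ 3 \\ -7 \end{bmatrix} = \begin{bmatrix} -3 \\ 1 \\ 3 \end{bmatrix}, \quad
T\begin{bmatrix} -1 \\ -2 \\ 6 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ -3 \end{bmatrix}, \quad
T\begin{bmatrix} 0 \\ -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 5 \\ 3 \\ -3 \end{bmatrix}
\]

Find the matrix of $T$. That is find $A$ such that $T(\vec{x}) = A\vec{x}$.

Exercise 5.2.6 Suppose $T$ is a linear transformation such that

\[
T\begin{bmatrix} 1 \\ 1 \\ -7 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \\ 3 \end{bmatrix}, \quad
T\begin{bmatrix} -1 \\ 0 \\ 6 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad
T\begin{bmatrix} 0 \\ -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ -1 \end{bmatrix}
\]

Find the matrix of $T$. That is find $A$ such that $T(\vec{x}) = A\vec{x}$.

Exercise 5.2.7 Suppose $T$ is a linear transformation such that

\[
T\begin{bmatrix} 1 \\ 2 \\ -18 \end{bmatrix} = \begin{bmatrix} 5 \\ 2 \\ 5 \end{bmatrix}, \quad
T\begin{bmatrix} -1 \\ -1 \\ 15 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \\ 5 \end{bmatrix}, \quad
T\begin{bmatrix} 0 \\ -1 \\ 4 \end{bmatrix} = \begin{bmatrix} 2 \\ 5 \\ -2 \end{bmatrix}
\]

Find the matrix of $T$. That is find $A$ such that $T(\vec{x}) = A\vec{x}$.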

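Exercise 5.2.8 Consider the following functions $T : \mathbb{R}^3 \to \mathbb{R}^2$. Show that each is a linear transformation and determine for each the matrix $A$ such that $T(\vec{x}) = A\vec{x}$.

(a) $T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z \\ 2y - 3x + z \end{bmatrix}$

(b) $T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 7x + 2y + z \\ 3x - 11y + 2z \end{bmatrix}$

(c) $T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3x + 2y + z \\ x + 2y + 6z \end{bmatrix}$

(d) $T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2y - 5x + z \\ x + y + z \end{bmatrix}$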

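Exercise 5.2.9 Consider the following functions $T : \mathbb{R}^3 \to \mathbb{R}^2$. Explain why each of these functions $T$ is not linear.

(a) $T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z + 1 \\ 2y - 3x + z \end{bmatrix}$

(b) $T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y^2 + 3z \\ 2y + 3x + z \end{bmatrix}$

(c) $T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} \sin x + 2y + 3z \\ 2y + 3x + z \end{bmatrix}$

(d) $T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z \\ 2y + 3x - \ln z \end{bmatrix}$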

Exercise 5.2.10 Suppose

\[ \begin{bmatrix} A_1 & \cdots & A_n \end{bmatrix}^{-1} \]

exists where each $A_j \in \mathbb{R}^n$ and let vectors $\{B_1, \cdots, B_n\}$ in $\mathbb{R}^m$ be given. Show that there always exists a linear transformation $T$ such that $T(A_i) = B_i$.

Exercise 5.2.11 Find the matrix for $T(\vec{w}) = \mathrm{proj}_{\vec{v}}(\vec{w})$ where $\vec{v} = \begin{bmatrix} 1 & -2 & 3 \end{bmatrix}^T$.

Exercise 5.2.12 Find the matrix for $T(\vec{w}) = \mathrm{proj}_{\vec{v}}(\vec{w})$ where $\vec{v} = \begin{bmatrix} 1 & 5 & 3 \end{bmatrix}^T$.

Exercise 5.2.13 Find the matrix for $T(\vec{w}) = \mathrm{proj}_{\vec{v}}(\vec{w})$ where $\vec{v} = \begin{bmatrix} 1 & 0 & 3 \end{bmatrix}^T$.

5.3 Properties of Linear Transformations


Outcomes 


A.  Use  properties  of  linear  transformations  to  solve  problems. 

B.  Find  the  composite  of  transformations  and  the  inverse  of  a  transformation. 

Let $T : \mathbb{R}^n \mapsto \mathbb{R}^m$ be a linear transformation. Then there are some important properties of $T$ which will be examined in this section. Consider the following theorem.


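These properties are useful in determining the action of a transformation on a given vector. Consider the following example.

Solution. Using the third property in the theorem above, we can find $T\begin{bmatrix} -7 \\ 3 \\ -9 \end{bmatrix}$ by writing $\begin{bmatrix} -7 \\ 3 \\ -9 \end{bmatrix}$ as a linear combination of $\begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 4 \\ 0 \\ 5 \end{bmatrix}$.

Therefore we want to find $a, b \in \mathbb{R}$ such that

\[ \begin{bmatrix} -7 \\ 3 \\ -9 \end{bmatrix} = a\begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} + b\begin{bmatrix} 4 \\ 0 \\ 5 \end{bmatrix} \]

The necessary augmented matrix and resulting reduced row-echelon form are given by:

\[
\left[\begin{array}{cc|c} 1 & 4 & -7 \\ 3 & 0 & 3 \\ 1 & 5 & -9 \end{array}\right]
\to \cdots \to
\left[\begin{array}{cc|c} 1 & 0 & 1 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{array}\right]
\]

Hence $a = 1$, $b = -2$ and

\[ \begin{bmatrix} -7 \\ 3 \\ -9 \end{bmatrix} = 1\begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} + (-2)\begin{bmatrix} 4 \\ 0 \\ 5 \end{bmatrix} \]

Now, using the third property above, we have

\[
\begin{aligned}
T\begin{bmatrix} -7 \\ 3 \\ -9 \end{bmatrix}
&= T\left( 1\begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} + (-2)\begin{bmatrix} 4 \\ 0 \\ 5 \end{bmatrix} \right) \\
&= 1T\begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} - 2T\begin{bmatrix} 4 \\ 0 \\ 5 \end{bmatrix} \\
&= \begin{bmatrix} 4 \\ 4 \\ 0 \\ -2 \end{bmatrix} - 2\begin{bmatrix} 4 \\ 5 \\ -1 \\ 5 \end{bmatrix}
= \begin{bmatrix} -4 \\ -6 \\ 2 \\ -12 \end{bmatrix}
\end{aligned}
\]

Therefore, $T\begin{bmatrix} -7 \\ 3 \\ -9 \end{bmatrix} = \begin{bmatrix} -4 \\ -6 \\ 2 \\ -12 \end{bmatrix}$.

♠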
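The arithmetic in this solution is easy to double-check. The sketch below takes the coefficients $a = 1$, $b = -2$ and the two images $T(\vec{a}_1) = (4, 4, 0, -2)$, $T(\vec{a}_2) = (4, 5, -1, 5)$ as read from the worked solution; the helper names `scale` and `add` are illustrative.

```python
def scale(c, v):
    """Multiply the vector v by the scalar c."""
    return [c * x for x in v]


def add(u, v):
    """Add two vectors of equal length."""
    return [x + y for x, y in zip(u, v)]


a1, a2 = [1, 3, 1], [4, 0, 5]              # vectors on which T is known
Ta1, Ta2 = [4, 4, 0, -2], [4, 5, -1, 5]    # their images, as used in the solution
a, b = 1, -2                               # coefficients from the row reduction

# The same coefficients combine the inputs and, by linearity, the outputs.
target = add(scale(a, a1), scale(b, a2))
image = add(scale(a, Ta1), scale(b, Ta2))
```

Here `target` reproduces the vector $(-7, 3, -9)$ and `image` is the value of $T$ at that vector.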


Suppose two linear transformations act in the same way on $\vec{x}$ for all vectors. Then we say that these transformations are equal.

Suppose two linear transformations act on the same vector $\vec{x}$, first the transformation $T$ and then a second transformation given by $S$. We can find the composite transformation that results from applying both transformations.

Notice that the resulting vector will be in $\mathbb{R}^m$. Be careful to observe the order of transformations. We write $S \circ T$ but apply the transformation $T$ first, followed by $S$.


Theorem  5.17:  Composition  of  Transformations 


Let $T : \mathbb{R}^k \mapsto \mathbb{R}^n$ and $S : \mathbb{R}^n \mapsto \mathbb{R}^m$ be linear transformations such that $T$ is induced by the matrix $A$ and $S$ is induced by the matrix $B$. Then $S \circ T$ is a linear transformation which is induced by the matrix $BA$.


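Consider the following example.

Solution. By Theorem 5.17, the matrix of $S \circ T$ is given by $BA$.

\[ BA = \begin{bmatrix} 2 & 3 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 2 & 0 \end{bmatrix} = \begin{bmatrix} 8 & 4 \\ 2 & 0 \end{bmatrix} \]

To find $(S \circ T)(\vec{x})$, multiply $\vec{x}$ by $BA$ as follows

\[ \begin{bmatrix} 8 & 4 \\ 2 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 4 \end{bmatrix} = \begin{bmatrix} 24 \\ 2 \end{bmatrix} \]

To check, first determine $T(\vec{x})$:

\[ \begin{bmatrix} 1 & 2 \\ 2 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 4 \end{bmatrix} = \begin{bmatrix} 9 \\ 2 \end{bmatrix} \]

Then, compute $S(T(\vec{x}))$ as follows:

\[ \begin{bmatrix} 2 & 3 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 9 \\ 2 \end{bmatrix} = \begin{bmatrix} 24 \\ 2 \end{bmatrix} \]

♠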
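The order of the factors in $BA$ is the point most easily gotten wrong, so a small check may help. The sketch below assumes the matrices $A$, $B$ and the vector $\vec{x}$ from the solution above; `matmul` and `apply` are illustrative helpers, not library functions.

```python
def matmul(X, Y):
    """Compute the matrix product X Y."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def apply(M, v):
    """Compute the matrix-vector product M v."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


A = [[1, 2], [2, 0]]   # induces T
B = [[2, 3], [0, 1]]   # induces S
x = [1, 4]

BA = matmul(B, A)                 # matrix of S o T: T acts first, so A is on the right
direct = apply(BA, x)             # (S o T)(x) computed in one step
stepwise = apply(B, apply(A, x))  # T first, then S
AB = matmul(A, B)                 # matrix of T o S, generally a different map
```

Both `direct` and `stepwise` give $(24, 2)$, while `AB` differs from `BA`, which illustrates why the order in Theorem 5.17 matters.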


Consider a composite transformation $S \circ T$, and suppose that this transformation acted such that $(S \circ T)(\vec{x}) = \vec{x}$. That is, the transformation $S$ took the vector $T(\vec{x})$ and returned it to $\vec{x}$. In this case, $S$ and $T$ are inverses of each other. Consider the following definition.

The following theorem is crucial, as it claims that the above inverse transformations are unique.

Theorem 5.20: Inverse of a Transformation

Let $T : \mathbb{R}^n \mapsto \mathbb{R}^n$ be a linear transformation induced by the matrix $A$. Then $T$ has an inverse transformation if and only if the matrix $A$ is invertible. In this case, the inverse transformation is unique and denoted $T^{-1} : \mathbb{R}^n \mapsto \mathbb{R}^n$. $T^{-1}$ is induced by the matrix $A^{-1}$.


Consider  the  following  example. 


Solution. Since the matrix $A$ is invertible, it follows that the transformation $T$ is invertible. Therefore, $T^{-1}$ exists.

You can verify that $A^{-1}$ is given by:


Therefore the linear transformation $T^{-1}$ is induced by the matrix $A^{-1}$. ♠


Exercises 


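Exercise 5.3.1 Show that if a function $T : \mathbb{R}^n \to \mathbb{R}^m$ is linear, then it is always the case that $T(\vec{0}) = \vec{0}$.

Exercise 5.3.2 Let $T$ be a linear transformation induced by the matrix $A = \begin{bmatrix} 3 & 1 \\ -1 & 2 \end{bmatrix}$ and $S$ a linear transformation induced by $B = \begin{bmatrix} 0 & -2 \\ 4 & 2 \end{bmatrix}$. Find matrix of $S \circ T$ and find $(S \circ T)(\vec{x})$ for $\vec{x} = \begin{bmatrix} 2 \\ -1 \end{bmatrix}$.

Exercise 5.3.3 Let $T$ be a linear transformation and suppose $T\begin{bmatrix} 1 \\ -4 \end{bmatrix} = \begin{bmatrix} 2 \\ -3 \end{bmatrix}$. Suppose $S$ is a linear transformation induced by the matrix $B = \begin{bmatrix} 1 & 2 \\ -1 & 3 \end{bmatrix}$. Find $(S \circ T)(\vec{x})$ for $\vec{x} = \begin{bmatrix} 1 \\ -4 \end{bmatrix}$.

Exercise 5.3.4 Let $T$ be a linear transformation induced by the matrix $A = \begin{bmatrix} 2 & 3 \\ 1 & 1 \end{bmatrix}$ and $S$ a linear transformation induced by $B = \begin{bmatrix} -1 & 3 \\ 1 & -2 \end{bmatrix}$. Find matrix of $S \circ T$ and find $(S \circ T)(\vec{x})$ for $\vec{x} = \begin{bmatrix} 5 \\ 6 \end{bmatrix}$.

Exercise 5.3.5 Let $T$ be a linear transformation induced by the matrix $A = \begin{bmatrix} 2 & 1 \\ 5 & 2 \end{bmatrix}$. Find the matrix of $T^{-1}$.

Exercise 5.3.6 Let $T$ be a linear transformation induced by the matrix $A = \begin{bmatrix} 4 & -3 \\ 2 & -2 \end{bmatrix}$. Find the matrix of $T^{-1}$.

Exercise 5.3.7 Let $T$ be a linear transformation and suppose $T\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 9 \\ 8 \end{bmatrix}$. Find the matrix of $T^{-1}$.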

5.4 Special Linear Transformations in $\mathbb{R}^2$

In this section, we will examine some special examples of linear transformations in $\mathbb{R}^2$ including rotations and reflections. We will use the geometric descriptions of vector addition and scalar multiplication discussed earlier to show that a rotation of vectors through an angle and reflection of a vector across a line are examples of linear transformations.

More generally, denote a transformation given by a rotation by $T$. Why is such a transformation linear? Consider the following picture which illustrates a rotation. Let $\vec{u}, \vec{v}$ denote vectors.


[Figure: the vectors $\vec{u}$, $\vec{v}$ and $\vec{u} + \vec{v}$ together with their rotated images $T(\vec{u})$, $T(\vec{v})$ and $T(\vec{u} + \vec{v})$.]

Let's consider how to obtain $T(\vec{u} + \vec{v})$. Simply, you add $T(\vec{u})$ and $T(\vec{v})$. Here is why. If you add $T(\vec{u})$ to $T(\vec{v})$ you get the diagonal of the parallelogram determined by $T(\vec{u})$ and $T(\vec{v})$, as this action is our usual vector addition. Now, suppose we first add $\vec{u}$ and $\vec{v}$, and then apply the transformation $T$ to $\vec{u} + \vec{v}$. Hence, we find $T(\vec{u} + \vec{v})$. As shown in the diagram, this will result in the same vector. In other words,

\[ T(\vec{u} + \vec{v}) = T(\vec{u}) + T(\vec{v}). \]

This is because the rotation preserves all angles between the vectors as well as their lengths. In particular, it preserves the shape of this parallelogram. Thus both $T(\vec{u}) + T(\vec{v})$ and $T(\vec{u} + \vec{v})$ give the same vector. It follows that $T$ distributes across addition of the vectors of $\mathbb{R}^2$.

Similarly, if $k$ is a scalar, it follows that $T(k\vec{u}) = kT(\vec{u})$. Thus rotations are an example of a linear transformation by Definition 5.2.

The following theorem gives the matrix of a linear transformation which rotates all vectors through an angle of $\theta$.


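Theorem 5.22: Rotation

Let $R_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ be a linear transformation given by rotating vectors through an angle of $\theta$. Then the matrix $A$ of $R_\theta$ is given by

\[ \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \]

Proof. Let $\vec{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\vec{e}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$. These identify the geometric vectors which point along the positive $x$ axis and positive $y$ axis as shown.

[Figure: the unit vectors $\vec{e}_1$, $\vec{e}_2$ and their images $R_\theta(\vec{e}_1)$, $R_\theta(\vec{e}_2)$ after rotation through $\theta$.]

From Theorem 5.6, we need to find $R_\theta(\vec{e}_1)$ and $R_\theta(\vec{e}_2)$, and use these as the columns of the matrix $A$ of $R_\theta$. We can use the cosine and sine of the angle $\theta$ to find the coordinates of $R_\theta(\vec{e}_1)$ as shown in the above picture. The coordinates of $R_\theta(\vec{e}_2)$ also follow from trigonometry. Thus

\[ R_\theta(\vec{e}_1) = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}, \quad R_\theta(\vec{e}_2) = \begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix} \]

Therefore, from Theorem 5.6,

\[ A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \]

We can also prove this algebraically without the use of the above picture. The definition of $(\cos(\theta), \sin(\theta))$ is as the coordinates of the point of $R_\theta(\vec{e}_1)$. Now the point of the vector $\vec{e}_2$ is exactly $\pi/2$ further along the unit circle from the point of $\vec{e}_1$, and therefore after rotation through an angle of $\theta$ the coordinates $x$ and $y$ of the point of $R_\theta(\vec{e}_2)$ are given by

\[ (x, y) = (\cos(\theta + \pi/2), \sin(\theta + \pi/2)) = (-\sin\theta, \cos\theta) \]

♠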
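The matrix in Theorem 5.22 translates directly into code. The sketch below is an illustration only: `rotation_matrix` and `apply` are names chosen here, angles are in radians, and floating-point results are approximate, so comparisons should use a tolerance.

```python
import math


def rotation_matrix(theta):
    """Matrix of the counterclockwise rotation of R^2 through theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def apply(A, x):
    """Compute the matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


R = rotation_matrix(math.pi / 2)
# Rotating (1, -2) by a quarter turn should give approximately (2, 1).
image = apply(R, [1, -2])
# The columns of R are the images of e_1 and e_2.
col1 = apply(R, [1, 0])
col2 = apply(R, [0, 1])
```

Up to rounding, `col1` is $(0, 1)$ and `col2` is $(-1, 0)$, matching $(\cos\theta, \sin\theta)$ and $(-\sin\theta, \cos\theta)$ at $\theta = \pi/2$.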

Consider  the  following  example. 


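Example 5.23: Rotation in $\mathbb{R}^2$

Let $R_{\pi/2} : \mathbb{R}^2 \to \mathbb{R}^2$ denote rotation through $\pi/2$. Find the matrix of $R_{\pi/2}$. Then, find $R_{\pi/2}(\vec{x})$ where $\vec{x} = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$.

Solution. By Theorem 5.22, the matrix of $R_{\pi/2}$ is given by

\[
\begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}
= \begin{bmatrix} \cos(\pi/2) & -\sin(\pi/2) \\ \sin(\pi/2) & \cos(\pi/2) \end{bmatrix}
= \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}
\]

To find $R_{\pi/2}(\vec{x})$, we multiply the matrix of $R_{\pi/2}$ by $\vec{x}$ as follows

\[ \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix} \]

♠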


We  now  look  at  an  example  of  a  linear  transformation  involving  two  angles. 


Example 5.24: The Rotation Matrix of the Sum of Two Angles

Find the matrix of the linear transformation which is obtained by first rotating all vectors through an angle of $\phi$ and then through an angle $\theta$. Hence the linear transformation rotates all vectors through an angle of $\theta + \phi$.

Solution. Let $R_{\theta+\phi}$ denote the linear transformation which rotates every vector through an angle of $\theta + \phi$. Then to obtain $R_{\theta+\phi}$, we first apply $R_\phi$ and then $R_\theta$ where $R_\phi$ is the linear transformation which rotates through an angle of $\phi$ and $R_\theta$ is the linear transformation which rotates through an angle of $\theta$. Denoting the corresponding matrices by $A_{\theta+\phi}$, $A_\phi$, and $A_\theta$, it follows that for every $\vec{u}$

\[ R_{\theta+\phi}(\vec{u}) = A_{\theta+\phi}\vec{u} = A_\theta A_\phi \vec{u} = R_\theta R_\phi(\vec{u}) \]

Notice the order of the matrices here!

Consequently, you must have

\[
A_{\theta+\phi}
= \begin{bmatrix} \cos(\theta+\phi) & -\sin(\theta+\phi) \\ \sin(\theta+\phi) & \cos(\theta+\phi) \end{bmatrix}
= \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix}
= A_\theta A_\phi
\]

The usual matrix multiplication yields

\[
A_{\theta+\phi}
= \begin{bmatrix} \cos(\theta+\phi) & -\sin(\theta+\phi) \\ \sin(\theta+\phi) & \cos(\theta+\phi) \end{bmatrix}
= \begin{bmatrix} \cos\theta\cos\phi - \sin\theta\sin\phi & -\cos\theta\sin\phi - \sin\theta\cos\phi \\ \sin\theta\cos\phi + \cos\theta\sin\phi & \cos\theta\cos\phi - \sin\theta\sin\phi \end{bmatrix}
= A_\theta A_\phi
\]

Don't these look familiar? They are the usual trigonometric identities for the sum of two angles derived here using linear algebra concepts.

♠
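The identity $A_{\theta+\phi} = A_\theta A_\phi$ can also be spot-checked numerically. The sketch below picks two arbitrary angles (the values `0.7` and `1.9` are assumptions chosen only for illustration) and compares both sides entry by entry within a floating-point tolerance; `rotation_matrix`, `matmul` and `close` are illustrative helper names.

```python
import math


def rotation_matrix(theta):
    """Matrix of the counterclockwise rotation of R^2 through theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(X, Y):
    """Compute the matrix product X Y."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices up to an absolute tolerance."""
    return all(math.isclose(a, b, abs_tol=tol)
               for rx, ry in zip(X, Y) for a, b in zip(rx, ry))


theta, phi = 0.7, 1.9  # arbitrary sample angles in radians
lhs = rotation_matrix(theta + phi)
rhs = matmul(rotation_matrix(theta), rotation_matrix(phi))
agree = close(lhs, rhs)
```

For rotations in the plane the two factors also commute, so `matmul(rotation_matrix(phi), rotation_matrix(theta))` agrees with `lhs` as well; this is special to plane rotations and does not hold for general matrices.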

Here we have focused on rotations in two dimensions. However, you can consider rotations and other geometric concepts in any number of dimensions. This is one of the major advantages of linear algebra. You can break down a difficult geometrical procedure into small steps, each corresponding to multiplication by an appropriate matrix. Then by multiplying the matrices, you can obtain a single matrix which can give you numerical information on the results of applying the given sequence of simple procedures.

Linear transformations which reflect vectors across a line are a second important type of transformations in $\mathbb{R}^2$. Consider the following theorem.


Theorem 5.25: Reflection

Let $Q_m : \mathbb{R}^2 \to \mathbb{R}^2$ be a linear transformation given by reflecting vectors over the line $y = mx$. Then the matrix of $Q_m$ is given by

\[ \frac{1}{1 + m^2} \begin{bmatrix} 1 - m^2 & 2m \\ 2m & m^2 - 1 \end{bmatrix} \]
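The reflection formula is simple enough to evaluate exactly with rational arithmetic. The sketch below assumes a rational slope $m$; the names `reflection_matrix` and `apply`, and the sample slopes and vectors, are illustrative choices rather than part of the text.

```python
from fractions import Fraction


def reflection_matrix(m):
    """Matrix of the reflection of R^2 over the line y = m x (rational slope m)."""
    m = Fraction(m)
    d = 1 + m * m
    return [[(1 - m * m) / d, 2 * m / d],
            [2 * m / d, (m * m - 1) / d]]


def apply(A, x):
    """Compute the matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Reflecting over y = x (slope 1) should swap the two coordinates.
Q1 = reflection_matrix(1)
swapped = apply(Q1, [3, 5])

# Reflecting twice over the same line should return the original vector.
Q3 = reflection_matrix(3)
twice = apply(Q3, apply(Q3, [7, -4]))

# A vector lying on the line y = 3x is left unchanged by the reflection.
on_line = apply(Q3, [1, 3])
```

These three checks (swap for $m = 1$, involution, fixed points on the line) are quick sanity tests for any matrix claimed to be a reflection.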

Consider  the  following  example. 


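Example 5.26: Reflection in $\mathbb{R}^2$

Let $Q_2 : \mathbb{R}^2 \to \mathbb{R}^2$ denote reflection over the line $y = 2x$. Then $Q_2$ is a linear transformation. Find the matrix of $Q_2$. Then, find $Q_2(\vec{x})$ where $\vec{x} = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$.

Solution. By Theorem 5.25, the matrix of $Q_2$ is given by

\[
\frac{1}{1 + m^2} \begin{bmatrix} 1 - m^2 & 2m \\ 2m & m^2 - 1 \end{bmatrix}
= \frac{1}{1 + (2)^2} \begin{bmatrix} 1 - (2)^2 & 2(2) \\ 2(2) & (2)^2 - 1 \end{bmatrix}
= \frac{1}{5} \begin{bmatrix} -3 & 4 \\ 4 & 3 \end{bmatrix}
\]

To find $Q_2(\vec{x})$ we multiply $\vec{x}$ by the matrix of $Q_2$ as follows:

\[ \frac{1}{5} \begin{bmatrix} -3 & 4 \\ 4 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} -\frac{11}{5} \\ -\frac{2}{5} \end{bmatrix} \]

♠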


Consider  the  following  example  which  incorporates  a  reflection  as  well  as  a  rotation  of  vectors. 


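Example 5.27: Rotation Followed by a Reflection

Find the matrix of the linear transformation which is obtained by first rotating all vectors through an angle of $\pi/6$ and then reflecting through the $x$ axis.

Solution. By Theorem 5.22, the matrix of the transformation which involves rotating through an angle of $\pi/6$ is

\[
\begin{bmatrix} \cos(\pi/6) & -\sin(\pi/6) \\ \sin(\pi/6) & \cos(\pi/6) \end{bmatrix}
= \begin{bmatrix} \frac{1}{2}\sqrt{3} & -\frac{1}{2} \\ \frac{1}{2} & \frac{1}{2}\sqrt{3} \end{bmatrix}
\]

Reflecting across the $x$ axis is the same action as reflecting vectors over the line $y = mx$ with $m = 0$. By Theorem 5.25, the matrix for the transformation which reflects all vectors through the $x$ axis is

\[
\frac{1}{1 + m^2} \begin{bmatrix} 1 - m^2 & 2m \\ 2m & m^2 - 1 \end{bmatrix}
= \frac{1}{1 + (0)^2} \begin{bmatrix} 1 - (0)^2 & 2(0) \\ 2(0) & (0)^2 - 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
\]

Therefore, the matrix of the linear transformation which first rotates through $\pi/6$ and then reflects through the $x$ axis is given by

\[
\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
\begin{bmatrix} \frac{1}{2}\sqrt{3} & -\frac{1}{2} \\ \frac{1}{2} & \frac{1}{2}\sqrt{3} \end{bmatrix}
= \begin{bmatrix} \frac{1}{2}\sqrt{3} & -\frac{1}{2} \\ -\frac{1}{2} & -\frac{1}{2}\sqrt{3} \end{bmatrix}
\]

♠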


Exercises 


Exercise 5.4.1 Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/3$.

Exercise 5.4.2 Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/4$.

Exercise 5.4.3 Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $-\pi/3$.

Exercise 5.4.4 Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $2\pi/3$.

Exercise 5.4.5 Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/12$. Hint: Note that $\pi/12 = \pi/3 - \pi/4$.

Exercise 5.4.6 Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $2\pi/3$ and then reflects across the $x$ axis.

Exercise 5.4.7 Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/3$ and then reflects across the $x$ axis.

Exercise 5.4.8 Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/4$ and then reflects across the $x$ axis.

Exercise 5.4.9 Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $\pi/6$ and then reflects across the $x$ axis followed by a reflection across the $y$ axis.

Exercise 5.4.10 Find the matrix for the linear transformation which reflects every vector in $\mathbb{R}^2$ across the $x$ axis and then rotates every vector through an angle of $\pi/4$.

Exercise 5.4.11 Find the matrix for the linear transformation which reflects every vector in $\mathbb{R}^2$ across the $y$ axis and then rotates every vector through an angle of $\pi/4$.

Exercise 5.4.12 Find the matrix for the linear transformation which reflects every vector in $\mathbb{R}^2$ across the $x$ axis and then rotates every vector through an angle of $\pi/6$.

Exercise 5.4.13 Find the matrix for the linear transformation which reflects every vector in $\mathbb{R}^2$ across the $y$ axis and then rotates every vector through an angle of $\pi/6$.

Exercise 5.4.14 Find the matrix for the linear transformation which rotates every vector in $\mathbb{R}^2$ through an angle of $5\pi/12$. Hint: Note that $5\pi/12 = 2\pi/3 - \pi/4$.

Exercise 5.4.15 Find the matrix of the linear transformation which rotates every vector in $\mathbb{R}^3$ counterclockwise about the $z$ axis when viewed from the positive $z$ axis through an angle of $30^\circ$ and then reflects through the $xy$ plane.

Exercise 5.4.16 Let $\vec{u} = \begin{bmatrix} a \\ b \end{bmatrix}$ be a unit vector in $\mathbb{R}^2$. Find the matrix which reflects all vectors across this vector, as shown in the following picture.

Hint: Notice that $\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}$ for some $\theta$. First rotate through $-\theta$. Next reflect through the $x$ axis. Finally rotate through $\theta$.

5.5 One to One and Onto Transformations


Let $T : \mathbb{R}^n \mapsto \mathbb{R}^m$ be a linear transformation. We define the range or image of $T$ as the set of vectors of $\mathbb{R}^m$ which are of the form $T(\vec{x})$ (equivalently, $A\vec{x}$) for some $\vec{x} \in \mathbb{R}^n$. It is common to write $T\mathbb{R}^n$, $T(\mathbb{R}^n)$, or $\mathrm{Im}(T)$ to denote these vectors.


Proof. This follows from the definition of matrix multiplication. ♠

This  section  is  devoted  to  studying  two  important  characterizations  of  linear  transformations,  called 
one  to  one  and  onto.  We  define  them  now. 


The  second  important  characterization  is  called  onto. 


Definition 5.30: Onto

Let $T : \mathbb{R}^n \mapsto \mathbb{R}^m$ be a linear transformation. Then $T$ is called onto if whenever $\vec{x}_2 \in \mathbb{R}^m$ there exists $\vec{x}_1 \in \mathbb{R}^n$ such that $T(\vec{x}_1) = \vec{x}_2$.

We often call a linear transformation which is one-to-one an injection. Similarly, a linear transformation which is onto is often called a surjection.

The following proposition is an important result.


Proof. We need to prove two things here. First, we will prove that if $T$ is one to one, then $T(\vec{x}) = \vec{0}$ implies that $\vec{x} = \vec{0}$. Second, we will show that if $T(\vec{x}) = \vec{0}$ implies that $\vec{x} = \vec{0}$, then it follows that $T$ is one to one. Recall that a linear transformation has the property that $T(\vec{0}) = \vec{0}$.

Suppose first that $T$ is one to one and consider $T(\vec{0})$.

\[ T(\vec{0}) = T(\vec{0} + \vec{0}) = T(\vec{0}) + T(\vec{0}) \]

and so, adding the additive inverse of $T(\vec{0})$ to both sides, one sees that $T(\vec{0}) = \vec{0}$. If $T(\vec{x}) = \vec{0}$ it must be the case that $\vec{x} = \vec{0}$ because it was just shown that $T(\vec{0}) = \vec{0}$ and $T$ is assumed to be one to one.

Now assume that if $T(\vec{x}) = \vec{0}$, then it follows that $\vec{x} = \vec{0}$. If $T(\vec{v}) = T(\vec{u})$, then

\[ T(\vec{v}) - T(\vec{u}) = T(\vec{v} - \vec{u}) = \vec{0} \]

which shows that $\vec{v} - \vec{u} = \vec{0}$. In other words, $\vec{v} = \vec{u}$, and $T$ is one to one. ♠

Note that this proposition says that if $A = \begin{bmatrix} A_1 & \cdots & A_n \end{bmatrix}$ then $A$ is one to one if and only if whenever

\[ \vec{0} = \sum_{k=1}^{n} c_k A_k \]

it follows that each scalar $c_k = 0$.

We  will  now  take  a  look  at  an  example  of  a  one  to  one  and  onto  linear  transformation. 


Example  5.32:  A  One  to  One  and  Onto  Linear  Transformation 


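Suppose

\[ T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]

Then, $T : \mathbb{R}^2 \to \mathbb{R}^2$ is a linear transformation. Is $T$ onto? Is it one to one?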


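Solution. Recall that because $T$ can be expressed as matrix multiplication, we know that $T$ is a linear transformation. We will start by looking at onto. So suppose $\begin{bmatrix} a \\ b \end{bmatrix} \in \mathbb{R}^2$. Does there exist $\begin{bmatrix} x \\ y \end{bmatrix} \in \mathbb{R}^2$ such that $T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix}$? If so, then since $\begin{bmatrix} a \\ b \end{bmatrix}$ is an arbitrary vector in $\mathbb{R}^2$, it will follow that $T$ is onto.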
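This question is familiar to you. It is asking whether there is a solution to the equation

\[ \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} \]

This is the same thing as asking for a solution to the following system of equations.

\[ \begin{aligned} x + y &= a \\ x + 2y &= b \end{aligned} \]

Set up the augmented matrix and row reduce.

\[ \left[ \begin{array}{cc|c} 1 & 1 & a \\ 1 & 2 & b \end{array} \right] \to \left[ \begin{array}{cc|c} 1 & 0 & 2a - b \\ 0 & 1 & b - a \end{array} \right] \tag{5.4} \]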


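You can see from this point that the system has a solution. Therefore, we have shown that for any $a, b$, there is a $\begin{bmatrix} x \\ y \end{bmatrix}$ such that $T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix}$. Thus $T$ is onto.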


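Now we want to know if $T$ is one to one. By Proposition 5.31 it is enough to show that $A\vec{x} = \vec{0}$ implies $\vec{x} = \vec{0}$. Consider the system $A\vec{x} = \vec{0}$ given by:

\[ \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

This is the same as the system given by

\[ \begin{aligned} x + y &= 0 \\ x + 2y &= 0 \end{aligned} \]

We need to show that the solution to this system is $x = 0$ and $y = 0$. By setting up the augmented matrix and row reducing, we end up with

\[ \left[ \begin{array}{cc|c} 1 & 0 & 0 \\ 0 & 1 & 0 \end{array} \right] \]

This tells us that $x = 0$ and $y = 0$. Returning to the original system, this says that if

\[ \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

then

\[ \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

In other words, $A\vec{x} = \vec{0}$ implies that $\vec{x} = \vec{0}$. By Proposition 5.31, $A$ is one to one, and so $T$ is also one to one.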


We also could have seen that $T$ is one to one from our above solution for onto. By looking at the matrix given by 5.4, you can see that there is a unique solution given by $x = 2a - b$ and $y = b - a$. Therefore, there is only one vector, specifically $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2a - b \\ b - a \end{bmatrix}$, such that $T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix}$. Hence by Definition 5.29, $T$ is one to one. ♠
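The conclusion of Example 5.32 can be spot-checked numerically. The following Python sketch is only an illustration: the function names `T` and `preimage` are our own choices, and the formula in `preimage` is the solution $x = 2a - b$, $y = b - a$ read off from the row-reduced matrix 5.4.

```python
def T(x, y):
    """The transformation of Example 5.32: multiply (x, y) by [[1, 1], [1, 2]]."""
    return (x + y, x + 2 * y)

def preimage(a, b):
    """The unique solution read off from (5.4): x = 2a - b, y = b - a."""
    return (2 * a - b, b - a)

# Applying T to the preimage of (a, b) should give back (a, b).
print(T(*preimage(3, 5)))  # (3, 5)
```

A finite check like this does not replace the row-reduction argument, but it is a convenient way to catch arithmetic slips.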
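Solution. You can prove that $T$ is in fact linear.

To show that $T$ is onto, let $\begin{bmatrix} x \\ y \end{bmatrix}$ be an arbitrary vector in $\mathbb{R}^2$. Taking the vector $\begin{bmatrix} x \\ y \\ 0 \\ 0 \end{bmatrix} \in \mathbb{R}^4$ we have

\[ T \begin{bmatrix} x \\ y \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} x + 0 \\ y + 0 \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} \]

This shows that $T$ is onto.

By Proposition 5.31 $T$ is one to one if and only if $T(\vec{x}) = \vec{0}$ implies that $\vec{x} = \vec{0}$. Observe that

\[ T \begin{bmatrix} 1 \\ 0 \\ 0 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 + -1 \\ 0 + 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

There exists a nonzero vector $\vec{x}$ in $\mathbb{R}^4$ such that $T(\vec{x}) = \vec{0}$. It follows that $T$ is not one to one. ♠

The above examples demonstrate a method to determine if a linear transformation $T$ is one to one or onto. It turns out that the matrix $A$ of $T$ can provide this information.
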

Theorem  5.34:  Matrix  of  a  One  to  One  or  Onto  Transformation 


Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation induced by the $m \times n$ matrix $A$. Then $T$ is one to one if and only if the rank of $A$ is $n$. $T$ is onto if and only if the rank of $A$ is $m$.


Consider  Example  5.33.  Above  we  showed  that  T  was  onto  but  not  one  to  one.  We  can  now  use  this 
theorem  to  determine  this  fact  about  T. 
Solution. Using Theorem 5.34 we can show that $T$ is onto but not one to one from the matrix of $T$. Recall that to find the matrix $A$ of $T$, we apply $T$ to each of the standard basis vectors $\vec{e}_i$ of $\mathbb{R}^4$. The result is the $2 \times 4$ matrix $A$ given by

\[ A = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix} \]

Fortunately, this matrix is already in reduced row-echelon form. The rank of $A$ is 2. Therefore by the above theorem $T$ is onto but not one to one. ♠
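The rank test of Theorem 5.34 is easy to mechanize. The sketch below is one possible implementation, not part of the text: the helper names `rank` and `classify` are hypothetical, and exact arithmetic with `fractions.Fraction` is used so that no rounding can hide a zero pivot.

```python
from fractions import Fraction

def rank(matrix):
    """Return the rank of a matrix (a list of rows) by exact row reduction."""
    rows = [[Fraction(entry) for entry in row] for row in matrix]
    pivot_row = 0
    for col in range(len(rows[0])):
        # Look for a nonzero entry in this column at or below pivot_row.
        pivot = next((r for r in range(pivot_row, len(rows))
                      if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        # Clear the entries below the pivot.
        for r in range(pivot_row + 1, len(rows)):
            factor = rows[r][col] / rows[pivot_row][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == len(rows):
            break
    return pivot_row

def classify(matrix):
    """Theorem 5.34 for an m x n matrix: return (is_one_to_one, is_onto)."""
    m, n = len(matrix), len(matrix[0])
    r = rank(matrix)
    return (r == n, r == m)

# The 2 x 4 matrix found above has rank 2, so it is onto but not one to one.
A = [[1, 0, 0, 1],
     [0, 1, 1, 0]]
print(classify(A))  # (False, True)
```

The same function applied to the matrix $\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}$ of Example 5.32 reports that the transformation is both one to one and onto.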


Recall  that  if  S  and  T  are  linear  transformations,  we  can  discuss  their  composite  denoted  S  o  T.  The 
following  examines  what  happens  if  both  S  and  T  are  onto. 


Example  5.36:  Composite  of  Onto  Transformations 


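Let $T : \mathbb{R}^k \to \mathbb{R}^n$ and $S : \mathbb{R}^n \to \mathbb{R}^m$ be linear transformations. If $T$ and $S$ are onto, then $S \circ T$ is onto.


Solution. Let $\vec{z} \in \mathbb{R}^m$. Since $S$ is onto, there exists a vector $\vec{y} \in \mathbb{R}^n$ such that $S(\vec{y}) = \vec{z}$. Furthermore, since $T$ is onto, there exists a vector $\vec{x} \in \mathbb{R}^k$ such that $T(\vec{x}) = \vec{y}$. Thus

\[ \vec{z} = S(\vec{y}) = S(T(\vec{x})) = (ST)(\vec{x}), \]

showing that for each $\vec{z} \in \mathbb{R}^m$ there exists an $\vec{x} \in \mathbb{R}^k$ such that $(ST)(\vec{x}) = \vec{z}$. Therefore, $S \circ T$ is onto. ♠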

The  next  example  shows  the  same  concept  with  regards  to  one-to-one  transformations. 


Solution. To prove that $S \circ T$ is one to one, we need to show that if $S(T(\vec{v})) = \vec{0}$ it follows that $\vec{v} = \vec{0}$. Suppose that $S(T(\vec{v})) = \vec{0}$. Since $S$ is one to one, it follows that $T(\vec{v}) = \vec{0}$. Similarly, since $T$ is one to one, it follows that $\vec{v} = \vec{0}$. Hence $S \circ T$ is one to one. ♠
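Exercises


Exercise 5.5.1 Let $T$ be a linear transformation given by

\[ T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]

Is $T$ one to one? Is $T$ onto?

Exercise 5.5.2 Let $T$ be a linear transformation given by

\[ T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -1 & 2 \\ 2 & 1 \\ 1 & 4 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]

Is $T$ one to one? Is $T$ onto?

Exercise 5.5.3 Let $T$ be a linear transformation given by

\[ T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2 & 0 & 1 \\ 1 & 2 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]

Is $T$ one to one? Is $T$ onto?

Exercise 5.5.4 Let $T$ be a linear transformation given by

\[ T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 & 3 & -5 \\ 2 & 0 & 2 \\ 2 & 4 & -6 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]

Is $T$ one to one? Is $T$ onto?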

Exercise 5.5.5 Give an example of a $3 \times 2$ matrix with the property that the linear transformation determined by this matrix is one to one but not onto.

Exercise 5.5.6 Suppose $A$ is an $m \times n$ matrix in which $m < n$. Suppose also that the rank of $A$ equals $m$. Show that the transformation $T$ determined by $A$ maps $\mathbb{R}^n$ onto $\mathbb{R}^m$. Hint: The vectors $\vec{e}_1, \cdots, \vec{e}_m$ occur as columns in the reduced row-echelon form for $A$.

Exercise 5.5.7 Suppose $A$ is an $m \times n$ matrix in which $m > n$. Suppose also that the rank of $A$ equals $n$. Show that $A$ is one to one. Hint: If not, there exists a nonzero vector $\vec{x}$ such that $A\vec{x} = \vec{0}$, and this implies at least one column of $A$ is a linear combination of the others. Show this would require the rank to be less than $n$.

Exercise 5.5.8 Explain why an $n \times n$ matrix $A$ is both one to one and onto if and only if its rank is $n$.
5.6 Isomorphisms


Outcomes 


A.  Determine  if  a  linear  transformation  is  an  isomorphism. 

B. Determine if two subspaces of $\mathbb{R}^n$ are isomorphic.

Recall the definition of a linear transformation. Let $V$ and $W$ be two subspaces of $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively. A mapping $T : V \to W$ is called a linear transformation or linear map if it preserves the algebraic operations of addition and scalar multiplication. Specifically, if $a, b$ are scalars and $\vec{x}, \vec{y}$ are vectors,

\[ T(a\vec{x} + b\vec{y}) = aT(\vec{x}) + bT(\vec{y}) \]

Consider  the  following  important  definition. 


Definition  5.38:  Isomorphism 


A  linear  map  T  is  called  an  isomorphism  if  the  following  two  conditions  are  satisfied. 

• $T$ is one to one. That is, if $T(\vec{x}) = T(\vec{y})$, then $\vec{x} = \vec{y}$.

• $T$ is onto. That is, if $\vec{w} \in W$, there exists $\vec{v} \in V$ such that $T(\vec{v}) = \vec{w}$.

Two such subspaces which have an isomorphism as described above are said to be isomorphic.
Consider  the  following  example  of  an  isomorphism. 


Solution.  To  prove  that  T  is  an  isomorphism  we  must  show 

1.  T  is  a  linear  transformation; 

2.  T  is  one  to  one; 

3.  T  is  onto. 


We  proceed  as  follows. 
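1. $T$ is a linear transformation:
Let $k, p$ be scalars.

\[ \begin{aligned}
T\left( k \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} + p \begin{bmatrix} x_2 \\ y_2 \end{bmatrix} \right)
&= T\left( \begin{bmatrix} kx_1 \\ ky_1 \end{bmatrix} + \begin{bmatrix} px_2 \\ py_2 \end{bmatrix} \right) \\
&= T \begin{bmatrix} kx_1 + px_2 \\ ky_1 + py_2 \end{bmatrix} \\
&= \begin{bmatrix} (kx_1 + px_2) + (ky_1 + py_2) \\ (kx_1 + px_2) - (ky_1 + py_2) \end{bmatrix} \\
&= \begin{bmatrix} (kx_1 + ky_1) + (px_2 + py_2) \\ (kx_1 - ky_1) + (px_2 - py_2) \end{bmatrix} \\
&= \begin{bmatrix} kx_1 + ky_1 \\ kx_1 - ky_1 \end{bmatrix} + \begin{bmatrix} px_2 + py_2 \\ px_2 - py_2 \end{bmatrix} \\
&= k \begin{bmatrix} x_1 + y_1 \\ x_1 - y_1 \end{bmatrix} + p \begin{bmatrix} x_2 + y_2 \\ x_2 - y_2 \end{bmatrix} \\
&= kT \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} + pT \begin{bmatrix} x_2 \\ y_2 \end{bmatrix}
\end{aligned} \]

Therefore $T$ is linear.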

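2. $T$ is one to one:

We need to show that if $T(\vec{x}) = \vec{0}$ for a vector $\vec{x} \in \mathbb{R}^2$, then it follows that $\vec{x} = \vec{0}$. Let $\vec{x} = \begin{bmatrix} x \\ y \end{bmatrix}$.

\[ T\left( \begin{bmatrix} x \\ y \end{bmatrix} \right) = \begin{bmatrix} x + y \\ x - y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

This provides a system of equations given by

\[ \begin{aligned} x + y &= 0 \\ x - y &= 0 \end{aligned} \]

You can verify that the solution to this system is $x = y = 0$. Therefore

\[ \vec{x} = \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

and $T$ is one to one.

3. $T$ is onto:

Let $a, b$ be scalars. We want to check if there is always a solution to

\[ T\left( \begin{bmatrix} x \\ y \end{bmatrix} \right) = \begin{bmatrix} x + y \\ x - y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} \]

This can be represented as the system of equations

\[ \begin{aligned} x + y &= a \\ x - y &= b \end{aligned} \]

Setting up the augmented matrix and row reducing gives

\[ \left[ \begin{array}{cc|c} 1 & 1 & a \\ 1 & -1 & b \end{array} \right] \to \cdots \to \left[ \begin{array}{cc|c} 1 & 0 & \frac{a+b}{2} \\ 0 & 1 & \frac{a-b}{2} \end{array} \right] \]

This has a solution for all $a, b$ and therefore $T$ is onto.

Therefore $T$ is an isomorphism. ♠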
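The row reduction in part 3 also hands us an explicit formula for the inverse map. As a hedged illustration (the names `T` and `T_inverse` are ours, and the map $T(x, y) = (x + y, x - y)$ is the one used in the solution above), the following Python sketch checks that the two maps undo each other:

```python
from fractions import Fraction

def T(x, y):
    """The map from the solution above: (x, y) -> (x + y, x - y)."""
    return (x + y, x - y)

def T_inverse(a, b):
    """Read off from the row-reduced matrix: x = (a + b)/2, y = (a - b)/2."""
    return (Fraction(a + b, 2), Fraction(a - b, 2))

# T undoes T_inverse and T_inverse undoes T, as expected of an isomorphism.
print(T(*T_inverse(4, 2)) == (4, 2))   # True
print(T_inverse(*T(3, 7)) == (3, 7))   # True
```

Exact fractions are used because the inverse involves division by 2, which floating point would only approximate for some inputs.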

An important property of isomorphisms is that the inverse of an isomorphism is also an isomorphism.


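Proof. Let $T$ be an isomorphism. Since $T$ is onto, a typical vector in $W$ is of the form $T(\vec{v})$ where $\vec{v} \in V$. Consider then for $a, b$ scalars,

\[ T^{-1}\left( aT(\vec{v}_1) + bT(\vec{v}_2) \right) \]

where $\vec{v}_1, \vec{v}_2 \in V$. Is this equal to

\[ aT^{-1}(T(\vec{v}_1)) + bT^{-1}(T(\vec{v}_2)) = a\vec{v}_1 + b\vec{v}_2? \]

Since $T$ is one to one, this will be so if

\[ T(a\vec{v}_1 + b\vec{v}_2) = T\left( T^{-1}\left( aT(\vec{v}_1) + bT(\vec{v}_2) \right) \right) = aT(\vec{v}_1) + bT(\vec{v}_2). \]

However, the above statement is just the condition that $T$ is a linear map. Thus $T^{-1}$ is indeed a linear map. If $\vec{v} \in V$ is given, then $\vec{v} = T^{-1}(T(\vec{v}))$ and so $T^{-1}$ is onto. If $T^{-1}(\vec{v}) = \vec{0}$, then

\[ \vec{v} = T(T^{-1}(\vec{v})) = T(\vec{0}) = \vec{0} \]

and so $T^{-1}$ is one to one. ♠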

Another  important  result  is  that  the  composition  of  multiple  isomorphisms  is  also  an  isomorphism. 


Proposition  5.41:  Composition  of  Isomorphisms 


Let $T : V \to W$ and $S : W \to Z$ be isomorphisms where $V, W, Z$ are subspaces of $\mathbb{R}^n$. Then $S \circ T$ defined by $(S \circ T)(\vec{v}) = S(T(\vec{v}))$ is also an isomorphism.

Proof. Suppose $T : V \to W$ and $S : W \to Z$ are isomorphisms. Why is $S \circ T$ a linear map? For $a, b$ scalars,

\[ \begin{aligned}
(S \circ T)(a\vec{v}_1 + b\vec{v}_2) &= S(T(a\vec{v}_1 + b\vec{v}_2)) = S(aT(\vec{v}_1) + bT(\vec{v}_2)) \\
&= aS(T(\vec{v}_1)) + bS(T(\vec{v}_2)) = a(S \circ T)(\vec{v}_1) + b(S \circ T)(\vec{v}_2)
\end{aligned} \]

Hence $S \circ T$ is a linear map. If $(S \circ T)(\vec{v}) = \vec{0}$, then $S(T(\vec{v})) = \vec{0}$ and, since $S$ is one to one, it follows that $T(\vec{v}) = \vec{0}$. Since $T$ is also one to one, $\vec{v} = \vec{0}$. Thus $S \circ T$ is one to one. It remains to verify that it is onto. Let $\vec{z} \in Z$. Then since $S$ is onto, there exists $\vec{w} \in W$ such that $S(\vec{w}) = \vec{z}$. Also, since $T$ is onto, there exists $\vec{v} \in V$ such that $T(\vec{v}) = \vec{w}$. It follows that $S(T(\vec{v})) = \vec{z}$ and so $S \circ T$ is also onto. ♠

Consider two subspaces $V$ and $W$, and suppose there exists an isomorphism mapping one to the other. In this way the two subspaces are related, which we can write as $V \sim W$. Then the previous two propositions together claim that $\sim$ is an equivalence relation. That is, $\sim$ satisfies the following conditions:


• $V \sim V$

• If $V \sim W$, it follows that $W \sim V$

• If $V \sim W$ and $W \sim Z$, then $V \sim Z$

We  leave  the  verification  of  these  conditions  as  an  exercise. 
Consider  the  following  example. 


Solution. The reason for this is that, since $A$ is invertible, the only vector it sends to $\vec{0}$ is the zero vector. Hence if $A(\vec{x}) = A(\vec{y})$, then $A(\vec{x} - \vec{y}) = \vec{0}$ and so $\vec{x} = \vec{y}$. It is onto because if $\vec{y} \in \mathbb{R}^n$, $A(A^{-1}(\vec{y})) = (AA^{-1})(\vec{y}) = \vec{y}$. ♠

In fact, all isomorphisms from $\mathbb{R}^n$ to $\mathbb{R}^n$ can be expressed as $T(\vec{x}) = A(\vec{x})$ where $A$ is an invertible $n \times n$ matrix. One simply considers the matrix whose $i$th column is $T\vec{e}_i$.

Recall  that  a  basis  of  a  subspace  V  is  a  set  of  linearly  independent  vectors  which  span  V.  The  following 
fundamental  lemma  describes  the  relation  between  bases  and  isomorphisms. 


Lemma  5.43:  Mapping  Bases 


Let $T : V \to W$ be a linear transformation where $V, W$ are subspaces of $\mathbb{R}^n$. If $T$ is one to one, then it has the property that if $\{\vec{u}_1, \cdots, \vec{u}_k\}$ is linearly independent, so is $\{T(\vec{u}_1), \cdots, T(\vec{u}_k)\}$.

More generally, $T$ is an isomorphism if and only if whenever $\{\vec{v}_1, \cdots, \vec{v}_n\}$ is a basis for $V$, it follows that $\{T(\vec{v}_1), \cdots, T(\vec{v}_n)\}$ is a basis for $W$.


Proof. First suppose that $T$ is a linear transformation and is one to one and $\{\vec{u}_1, \cdots, \vec{u}_k\}$ is linearly independent. It is required to show that $\{T(\vec{u}_1), \cdots, T(\vec{u}_k)\}$ is also linearly independent. Suppose then that

\[ \sum_{i=1}^{k} c_i T(\vec{u}_i) = \vec{0} \]

Then, since $T$ is linear,

\[ T\left( \sum_{i=1}^{k} c_i \vec{u}_i \right) = \vec{0} \]

Since $T$ is one to one, it follows that

\[ \sum_{i=1}^{k} c_i \vec{u}_i = \vec{0} \]

Now the fact that $\{\vec{u}_1, \cdots, \vec{u}_k\}$ is linearly independent implies that each $c_i = 0$. Hence $\{T(\vec{u}_1), \cdots, T(\vec{u}_k)\}$ is linearly independent.

Now suppose that $T$ is an isomorphism and $\{\vec{v}_1, \cdots, \vec{v}_n\}$ is a basis for $V$. It was just shown that $\{T(\vec{v}_1), \cdots, T(\vec{v}_n)\}$ is linearly independent. It remains to verify that $\operatorname{span}\{T(\vec{v}_1), \cdots, T(\vec{v}_n)\} = W$. If $\vec{w} \in W$, then since $T$ is onto there exists $\vec{v} \in V$ such that $T(\vec{v}) = \vec{w}$. Since $\{\vec{v}_1, \cdots, \vec{v}_n\}$ is a basis, it follows that there exist scalars $\{c_i\}_{i=1}^{n}$ such that

\[ \sum_{i=1}^{n} c_i \vec{v}_i = \vec{v} \]

Hence,

\[ \vec{w} = T(\vec{v}) = T\left( \sum_{i=1}^{n} c_i \vec{v}_i \right) = \sum_{i=1}^{n} c_i T(\vec{v}_i) \]

It follows that $\operatorname{span}\{T(\vec{v}_1), \cdots, T(\vec{v}_n)\} = W$ showing that this set of vectors is a basis for $W$.

Next suppose that $T$ is a linear transformation which takes a basis to a basis. This means that if $\{\vec{v}_1, \cdots, \vec{v}_n\}$ is a basis for $V$, it follows $\{T(\vec{v}_1), \cdots, T(\vec{v}_n)\}$ is a basis for $W$. Then if $\vec{w} \in W$, there exist scalars $c_i$ such that $\vec{w} = \sum_{i=1}^{n} c_i T(\vec{v}_i) = T\left( \sum_{i=1}^{n} c_i \vec{v}_i \right)$ showing that $T$ is onto. If $T\left( \sum_{i=1}^{n} c_i \vec{v}_i \right) = \vec{0}$ then $\sum_{i=1}^{n} c_i T(\vec{v}_i) = \vec{0}$ and since the vectors $\{T(\vec{v}_1), \cdots, T(\vec{v}_n)\}$ are linearly independent, it follows that each $c_i = 0$. Since $\sum_{i=1}^{n} c_i \vec{v}_i$ is a typical vector in $V$, this has shown that if $T(\vec{v}) = \vec{0}$ then $\vec{v} = \vec{0}$ and so $T$ is also one to one. Thus $T$ is an isomorphism. ♠


The  following  theorem  illustrates  a  very  useful  idea  for  defining  an  isomorphism.  Basically,  if  you 
know  what  it  does  to  a  basis,  then  you  can  construct  the  isomorphism. 


Proof. Suppose first that these two subspaces have the same dimension. Let a basis for $V$ be $\{\vec{v}_1, \cdots, \vec{v}_n\}$ and let a basis for $W$ be $\{\vec{w}_1, \cdots, \vec{w}_n\}$. Now define $T$ as follows.

\[ T(\vec{v}_i) = \vec{w}_i \]

for $\sum_{i=1}^{n} c_i \vec{v}_i$ an arbitrary vector of $V$,

\[ T\left( \sum_{i=1}^{n} c_i \vec{v}_i \right) = \sum_{i=1}^{n} c_i T(\vec{v}_i) = \sum_{i=1}^{n} c_i \vec{w}_i. \]

It is necessary to verify that this is well defined. Suppose then that

\[ \sum_{i=1}^{n} c_i \vec{v}_i = \sum_{i=1}^{n} \hat{c}_i \vec{v}_i \]

Then

\[ \sum_{i=1}^{n} (c_i - \hat{c}_i) \vec{v}_i = \vec{0} \]

and since $\{\vec{v}_1, \cdots, \vec{v}_n\}$ is a basis, $c_i = \hat{c}_i$ for each $i$. Hence

\[ \sum_{i=1}^{n} c_i \vec{w}_i = \sum_{i=1}^{n} \hat{c}_i \vec{w}_i \]

and so the mapping is well defined. Also if $a, b$ are scalars,

\[ \begin{aligned}
T\left( a \sum_{i=1}^{n} c_i \vec{v}_i + b \sum_{i=1}^{n} \hat{c}_i \vec{v}_i \right)
&= T\left( \sum_{i=1}^{n} (a c_i + b \hat{c}_i) \vec{v}_i \right) = \sum_{i=1}^{n} (a c_i + b \hat{c}_i) \vec{w}_i \\
&= a \sum_{i=1}^{n} c_i \vec{w}_i + b \sum_{i=1}^{n} \hat{c}_i \vec{w}_i \\
&= aT\left( \sum_{i=1}^{n} c_i \vec{v}_i \right) + bT\left( \sum_{i=1}^{n} \hat{c}_i \vec{v}_i \right)
\end{aligned} \]

Thus $T$ is a linear transformation.

Now if

\[ T\left( \sum_{i=1}^{n} c_i \vec{v}_i \right) = \sum_{i=1}^{n} c_i \vec{w}_i = \vec{0}, \]

then since the $\{\vec{w}_1, \cdots, \vec{w}_n\}$ are independent, each $c_i = 0$ and so $\sum_{i=1}^{n} c_i \vec{v}_i = \vec{0}$ also. Hence $T$ is one to one. If $\sum_{i=1}^{n} c_i \vec{w}_i$ is a vector in $W$, then it equals

\[ \sum_{i=1}^{n} c_i T(\vec{v}_i) = T\left( \sum_{i=1}^{n} c_i \vec{v}_i \right) \]

showing that $T$ is also onto. Hence $T$ is an isomorphism and so $V$ and $W$ are isomorphic.

Next suppose $T : V \to W$ is an isomorphism, so these two subspaces are isomorphic. Then for $\{\vec{v}_1, \cdots, \vec{v}_n\}$ a basis for $V$, it follows that a basis for $W$ is $\{T(\vec{v}_1), \cdots, T(\vec{v}_n)\}$ showing that the two subspaces have the same dimension.

Now suppose the two subspaces have the same dimension. Consider the three claimed equivalences.

First consider the claim that 1.) $\Rightarrow$ 2.). If $T$ is one to one and if $\{\vec{v}_1, \cdots, \vec{v}_n\}$ is a basis for $V$, then $\{T(\vec{v}_1), \cdots, T(\vec{v}_n)\}$ is linearly independent. If it is not a basis, then it must fail to span $W$. But then there would exist $\vec{w} \notin \operatorname{span}\{T(\vec{v}_1), \cdots, T(\vec{v}_n)\}$ and it follows that $\{T(\vec{v}_1), \cdots, T(\vec{v}_n), \vec{w}\}$ would be linearly independent which is impossible because there exists a basis for $W$ of $n$ vectors.

Hence $\operatorname{span}\{T(\vec{v}_1), \cdots, T(\vec{v}_n)\} = W$ and so $\{T(\vec{v}_1), \cdots, T(\vec{v}_n)\}$ is a basis. If $\vec{w} \in W$, there exist scalars $c_i$ such that

\[ \vec{w} = \sum_{i=1}^{n} c_i T(\vec{v}_i) = T\left( \sum_{i=1}^{n} c_i \vec{v}_i \right) \]

showing that $T$ is onto. This shows that 1.) $\Rightarrow$ 2.).

Next consider the claim that 2.) $\Rightarrow$ 3.). Since 2.) holds, it follows that $T$ is onto. It remains to verify that $T$ is one to one. Since $T$ is onto, there exists a basis of the form $\{T(\vec{v}_1), \cdots, T(\vec{v}_n)\}$. Then it follows that $\{\vec{v}_1, \cdots, \vec{v}_n\}$ is linearly independent. Suppose

\[ \sum_{i=1}^{n} c_i \vec{v}_i = \vec{0} \]

Then

\[ \sum_{i=1}^{n} c_i T(\vec{v}_i) = \vec{0} \]

Hence each $c_i = 0$ and so, $\{\vec{v}_1, \cdots, \vec{v}_n\}$ is a basis for $V$. Now it follows that a typical vector in $V$ is of the form $\sum_{i=1}^{n} c_i \vec{v}_i$. If $T\left( \sum_{i=1}^{n} c_i \vec{v}_i \right) = \vec{0}$, it follows that

\[ \sum_{i=1}^{n} c_i T(\vec{v}_i) = \vec{0} \]

and so, since $\{T(\vec{v}_1), \cdots, T(\vec{v}_n)\}$ is independent, it follows each $c_i = 0$ and hence $\sum_{i=1}^{n} c_i \vec{v}_i = \vec{0}$. Thus $T$ is one to one as well as onto and so it is an isomorphism.

If $T$ is an isomorphism, it is both one to one and onto by definition so 3.) implies both 1.) and 2.). ♠

Note the interesting way of defining a linear transformation in the first part of the argument by describing what it does to a basis and then “extending it linearly” to the entire subspace.


Solution. First observe that these subspaces are both of dimension 3 and so they are isomorphic by Theorem 5.44. The three vectors which span $W$ are easily seen to be linearly independent by making them the columns of a matrix and row reducing to the reduced row-echelon form.

You can exhibit an isomorphism of these two spaces as follows.

\[ T(\vec{e}_1) = \begin{bmatrix} 1 \\ 2 \\ 1 \\ 1 \end{bmatrix}, \quad T(\vec{e}_2) = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix}, \quad T(\vec{e}_3) = \begin{bmatrix} 1 \\ 1 \\ 2 \\ 0 \end{bmatrix} \]

and extend linearly. Recall that the matrix of this linear transformation is just the matrix having these vectors as columns. Thus the matrix of this isomorphism is

\[ \begin{bmatrix} 1 & 0 & 1 \\ 2 & 1 & 1 \\ 1 & 0 & 2 \\ 1 & 1 & 0 \end{bmatrix} \]

You should check that multiplication on the left by this matrix does reproduce the claimed effect resulting from an application by $T$. ♠

Consider  the  following  example. 


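Solution. First note that the vectors

\[ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \]

are indeed a basis for $\mathbb{R}^3$ as can be seen by making them the columns of a matrix and using the reduced row-echelon form.

Now recall the matrix of $T$ is a $4 \times 3$ matrix $A$ which gives the same effect as $T$. Thus, from the way we multiply matrices,

\[ A \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 1 & 1 \\ 1 & 0 & 2 \\ 1 & 1 & 0 \end{bmatrix} \]

Hence,

\[ A = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 1 & 1 \\ 1 & 0 & 2 \\ 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 2 & -1 & 1 \\ -1 & 2 & -1 \end{bmatrix} \]

Note how the span of the columns of this new matrix must be the same as the span of the vectors defining $W$. ♠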
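The computation $A = (\text{images}) (\text{basis})^{-1}$ used in this solution can be reproduced with a few lines of Python. This is only a sketch under our own naming: `inverse`, `matmul`, `basis` and `images` are hypothetical helper names, the inverse is computed by Gauss-Jordan elimination on exact fractions, and the code assumes the matrix passed to `inverse` really is invertible.

```python
from fractions import Fraction

def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination (assumes M is invertible)."""
    n = len(M)
    # Augment M with the identity matrix.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(X, Y):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# Columns of `basis` are the basis vectors of R^3; columns of `images` are their images.
basis = [[1, 0, 1], [1, 1, 1], [0, 1, 1]]
images = [[1, 0, 1], [2, 1, 1], [1, 0, 2], [1, 1, 0]]

A = matmul(images, inverse(basis))
print([[int(x) for x in row] for row in A])
# [[1, 0, 0], [0, 2, -1], [2, -1, 1], [-1, 2, -1]]
```

Multiplying `A` by `basis` should return `images`, which is exactly the defining property of the matrix of $T$.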


This  idea  of  defining  a  linear  transformation  by  what  it  does  on  a  basis  works  for  linear  maps  which 
are  not  necessarily  isomorphisms. 


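Solution. Note that in this case, the three vectors which span $W$ are not linearly independent. Nevertheless the above procedure will still work. The reasoning is the same as before. If $A$ is this matrix, then

\[ A \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 2 \end{bmatrix} \]

and so

\[ A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 1 \end{bmatrix} \]

The columns of this last matrix are obviously not linearly independent. ♠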
Exercises


Exercise 5.6.1 Let $V$ and $W$ be subspaces of $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively and let $T : V \to W$ be a linear transformation. Suppose that $\{T\vec{v}_1, \cdots, T\vec{v}_r\}$ is linearly independent. Show that it must be the case that $\{\vec{v}_1, \cdots, \vec{v}_r\}$ is also linearly independent.


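Exercise 5.6.2 Let

\[ V = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 1 \\ 2 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \end{bmatrix} \right\} \]

Let $T\vec{x} = A\vec{x}$ where $A$ is the matrix

\[ \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 2 & 1 \\ 1 & 1 & 1 & 2 \end{bmatrix} \]

Give a basis for $\operatorname{im}(T)$.

Exercise 5.6.3 Let

\[ V = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 4 \\ 4 \\ 1 \end{bmatrix} \right\} \]

Let $T\vec{x} = A\vec{x}$ where $A$ is the matrix

\[ \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 2 & 1 \\ 1 & 1 & 1 & 2 \end{bmatrix} \]

Find a basis for $\operatorname{im}(T)$. In this case, the original vectors do not form an independent set.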


Exercise 5.6.4 If $\{\vec{v}_1, \cdots, \vec{v}_r\}$ is linearly independent and $T$ is a one to one linear transformation, show that $\{T\vec{v}_1, \cdots, T\vec{v}_r\}$ is also linearly independent. Give an example which shows that if $T$ is only linear, it can happen that, although $\{\vec{v}_1, \cdots, \vec{v}_r\}$ is linearly independent, $\{T\vec{v}_1, \cdots, T\vec{v}_r\}$ is not. In fact, show that it can happen that each of the $T\vec{v}_j$ equals $\vec{0}$.

Exercise 5.6.5 Let $V$ and $W$ be subspaces of $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively and let $T : V \to W$ be a linear transformation. Show that if $T$ is onto $W$ and if $\{\vec{v}_1, \cdots, \vec{v}_r\}$ is a basis for $V$, then $\operatorname{span}\{T\vec{v}_1, \cdots, T\vec{v}_r\} = W$.


Exercise 5.6.6 Define $T : \mathbb{R}^4 \to \mathbb{R}^3$ as follows.

\[ T\vec{x} = \begin{bmatrix} 3 & 2 & 1 & 8 \\ 2 & 2 & -2 & 6 \\ 1 & 1 & -1 & 3 \end{bmatrix} \vec{x} \]

Find a basis for $\operatorname{im}(T)$. Also find a basis for $\ker(T)$.

Exercise 5.6.7 Define $T : \mathbb{R}^3 \to \mathbb{R}^3$ as follows.

\[ T\vec{x} = \begin{bmatrix} 1 & 2 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix} \vec{x} \]

where on the right, it is just matrix multiplication of the vector $\vec{x}$ which is meant. Explain why $T$ is an isomorphism of $\mathbb{R}^3$ to $\mathbb{R}^3$.


Exercise 5.6.8 Suppose $T : \mathbb{R}^3 \to \mathbb{R}^3$ is a linear transformation given by

\[ T\vec{x} = A\vec{x} \]

where $A$ is a $3 \times 3$ matrix. Show that $T$ is an isomorphism if and only if $A$ is invertible.


Exercise 5.6.9 Suppose $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation given by

\[ T\vec{x} = A\vec{x} \]

where $A$ is an $m \times n$ matrix. Show that $T$ is never an isomorphism if $m \neq n$. In particular, show that if $m > n$, $T$ cannot be onto and if $m < n$, then $T$ cannot be one to one.


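Exercise 5.6.10 Define $T : \mathbb{R}^2 \to \mathbb{R}^3$ as follows.

\[ T\vec{x} = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 0 & 1 \end{bmatrix} \vec{x} \]

where on the right, it is just matrix multiplication of the vector $\vec{x}$ which is meant. Show that $T$ is one to one. Next let $W = \operatorname{im}(T)$. Show that $T$ is an isomorphism of $\mathbb{R}^2$ and $\operatorname{im}(T)$.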


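Exercise 5.6.11 In the above problem, find a $2 \times 3$ matrix $A$ such that the restriction of $A$ to $\operatorname{im}(T)$ gives the same result as $T^{-1}$ on $\operatorname{im}(T)$. Hint: You might let $A$ be such that

\[ A \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad A \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

now find another vector $\vec{v} \in \mathbb{R}^3$ such that

\[ \left\{ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \vec{v} \right\} \]

is a basis. You could pick

\[ \vec{v} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]

for example. Explain why this one works or one of your choice works. Then you could define $A\vec{v}$ to equal some vector in $\mathbb{R}^2$. Explain why there will be more than one such matrix $A$ which will deliver the inverse isomorphism $T^{-1}$ on $\operatorname{im}(T)$.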


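Exercise 5.6.12 Now let $V$ equal $\operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \right\}$ and let $T : V \to W$ be a linear transformation where

\[ W = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \end{bmatrix} \right\} \]

and

\[ T \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \quad T \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \end{bmatrix} \]

Explain why $T$ is an isomorphism. Determine a matrix $A$ which, when multiplied on the left gives the same result as $T$ on $V$ and a matrix $B$ which delivers $T^{-1}$ on $W$. Hint: You need to have

\[ A \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \\ 0 & 1 \end{bmatrix} \]

Now enlarge $\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$ to obtain a basis for $\mathbb{R}^3$. You could add in $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ for example, and then pick another vector in $\mathbb{R}^4$ and let $A \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ equal this other vector. Then you would have

\[ A \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix} \]

This would involve picking for the new vector in $\mathbb{R}^4$ the vector $\begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix}^T$. Then you could find $A$.

You can do something similar to find a matrix for $T^{-1}$ denoted as $B$.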
5.7 The Kernel And Image Of A Linear Map


Outcomes 


A.  Describe  the  kernel  and  image  of  a  linear  transformation,  and  find  a  basis  for  each. 


In this section we will consider the case where the linear transformation is not necessarily an isomorphism. First consider the following important definition.


It follows that $\operatorname{im}(T)$ and $\ker(T)$ are subspaces of $W$ and $V$ respectively.


Proposition  5.49:  Kernel  and  Image  as  Subspaces 


Let $V, W$ be subspaces of $\mathbb{R}^n$ and let $T : V \to W$ be a linear transformation. Then $\ker(T)$ is a subspace of $V$ and $\operatorname{im}(T)$ is a subspace of $W$.


Proof. First consider $\ker(T)$. It is necessary to show that if $\vec{v}_1, \vec{v}_2$ are vectors in $\ker(T)$ and if $a, b$ are scalars, then $a\vec{v}_1 + b\vec{v}_2$ is also in $\ker(T)$. But

\[ T(a\vec{v}_1 + b\vec{v}_2) = aT(\vec{v}_1) + bT(\vec{v}_2) = a\vec{0} + b\vec{0} = \vec{0} \]

Thus $\ker(T)$ is a subspace of $V$.

Next suppose $T(\vec{v}_1), T(\vec{v}_2)$ are two vectors in $\operatorname{im}(T)$. Then if $a, b$ are scalars,

\[ aT(\vec{v}_1) + bT(\vec{v}_2) = T(a\vec{v}_1 + b\vec{v}_2) \]

and this last vector is in $\operatorname{im}(T)$ by definition. ♠

We  will  now  examine  how  to  find  the  kernel  and  image  of  a  linear  transformation  and  describe  the 
basis  of  each. 
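Solution. You can verify that $T$ is a linear transformation.

First we will find a basis for $\ker(T)$. To do so, we want to find a way to describe all vectors $\vec{x} \in \mathbb{R}^4$ such that $T(\vec{x}) = \vec{0}$. Let $\vec{x} = \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix}$ be such a vector. Then

\[ T \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = \begin{bmatrix} a - b \\ c + d \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

The values of $a, b, c, d$ that make this true are given by solutions to the system

\[ \begin{aligned} a - b &= 0 \\ c + d &= 0 \end{aligned} \]

The solution to this system is $a = s, b = s, c = t, d = -t$ where $s, t$ are scalars. We can describe $\ker(T)$ as follows.

\[ \ker(T) = \left\{ \begin{bmatrix} s \\ s \\ t \\ -t \end{bmatrix} \right\} = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \\ -1 \end{bmatrix} \right\} \]

Notice that this set is linearly independent and therefore forms a basis for $\ker(T)$.

We move on to finding a basis for $\operatorname{im}(T)$. We can write the image of $T$ as

\[ \operatorname{im}(T) = \left\{ \begin{bmatrix} a - b \\ c + d \end{bmatrix} \right\} \]

We can write this in the form

\[ \operatorname{im}(T) = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\} \]

This set is clearly not linearly independent. By removing unnecessary vectors from the set we can create a linearly independent set with the same span. This gives a basis for $\operatorname{im}(T)$ as

\[ \operatorname{im}(T) = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\} \]

♠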
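The kernel computation in this solution follows a mechanical recipe: row reduce the matrix of $T$, then build one basis vector for each free variable. The Python sketch below illustrates that recipe under our own assumptions: the helper names `rref` and `kernel_basis` are hypothetical, and the matrix `A` is the matrix of the map $(a, b, c, d) \mapsto (a - b, c + d)$ used above.

```python
from fractions import Fraction

def rref(matrix):
    """Return (reduced row-echelon form, list of pivot columns), using exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots = []
    r = 0
    for c in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    return rows, pivots

def kernel_basis(matrix):
    """One basis vector of the kernel per free (non-pivot) column."""
    rows, pivots = rref(matrix)
    n = len(matrix[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)               # set this free variable to 1
        for r, p in enumerate(pivots):
            v[p] = -rows[r][free]           # solve for each pivot variable
        basis.append(v)
    return basis

A = [[1, -1, 0, 0],
     [0, 0, 1, 1]]
basis = kernel_basis(A)
print([[int(x) for x in v] for v in basis])  # [[1, 1, 0, 0], [0, 0, -1, 1]]
```

The second vector is the negative of the one found by hand, so it spans the same line. The number of pivot columns (2) is the dimension of the image and the number of free columns (2) is the dimension of the kernel, and these add up to 4, the dimension of the domain.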

Recall that a linear transformation $T$ is called one to one if and only if $T(\vec{x}) = \vec{0}$ implies $\vec{x} = \vec{0}$. Using the concept of kernel, we can state this theorem in another way.


Theorem 5.51: One to One and Kernel


Let $T$ be a linear transformation where $\ker(T)$ is the kernel of $T$. Then $T$ is one to one if and only if $\ker(T)$ consists of only the zero vector.


A major result is the relation between the dimension of the kernel and dimension of the image of a linear transformation. In the previous example $\ker(T)$ had dimension 2, and $\operatorname{im}(T)$ also had dimension of 2. Is it a coincidence that the dimension of $\mathbb{R}^4$ is $4 = 2 + 2$? Consider the following theorem.


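Proof. From Proposition 5.49, $\operatorname{im}(T)$ is a subspace of $W$. We know that there exists a basis for $\operatorname{im}(T)$, $\{T(\vec{v}_1), \cdots, T(\vec{v}_r)\}$. Similarly, there is a basis for $\ker(T)$, $\{\vec{u}_1, \cdots, \vec{u}_s\}$. Then if $\vec{v} \in V$, there exist scalars $c_i$ such that

\[ T(\vec{v}) = \sum_{i=1}^{r} c_i T(\vec{v}_i) \]

Hence $T\left( \vec{v} - \sum_{i=1}^{r} c_i \vec{v}_i \right) = \vec{0}$. It follows that $\vec{v} - \sum_{i=1}^{r} c_i \vec{v}_i$ is in $\ker(T)$. Hence there are scalars $a_j$ such that

\[ \vec{v} - \sum_{i=1}^{r} c_i \vec{v}_i = \sum_{j=1}^{s} a_j \vec{u}_j \]

Hence $\vec{v} = \sum_{i=1}^{r} c_i \vec{v}_i + \sum_{j=1}^{s} a_j \vec{u}_j$. Since $\vec{v}$ is arbitrary, it follows that

\[ V = \operatorname{span}\{\vec{u}_1, \cdots, \vec{u}_s, \vec{v}_1, \cdots, \vec{v}_r\} \]

If the vectors $\{\vec{u}_1, \cdots, \vec{u}_s, \vec{v}_1, \cdots, \vec{v}_r\}$ are linearly independent, then it will follow that this set is a basis.

Suppose then that

\[ \sum_{i=1}^{r} c_i \vec{v}_i + \sum_{j=1}^{s} a_j \vec{u}_j = \vec{0} \]

Apply $T$ to both sides to obtain

\[ \sum_{i=1}^{r} c_i T(\vec{v}_i) + \sum_{j=1}^{s} a_j T(\vec{u}_j) = \sum_{i=1}^{r} c_i T(\vec{v}_i) = \vec{0} \]

Since $\{T(\vec{v}_1), \cdots, T(\vec{v}_r)\}$ is linearly independent, it follows that each $c_i = 0$. Hence $\sum_{j=1}^{s} a_j \vec{u}_j = \vec{0}$ and so, since the $\{\vec{u}_1, \cdots, \vec{u}_s\}$ are linearly independent, it follows that each $a_j = 0$ also. Therefore $\{\vec{u}_1, \cdots, \vec{u}_s, \vec{v}_1, \cdots, \vec{v}_r\}$ is a basis for $V$ and so

\[ n = s + r = \dim(\ker(T)) + \dim(\operatorname{im}(T)) \]

♠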


The  above  theorem  leads  to  the  next  corollary. 


This follows directly from the fact that $n = \dim(\ker(T)) + \dim(\operatorname{im}(T))$.
Consider  the  following  example. 


Solution. Since the two columns of the above matrix are linearly independent, we conclude that $\dim(\mathrm{im}(T)) = 2$
and therefore $\dim(\ker(T)) = 2 - \dim(\mathrm{im}(T)) = 2 - 2 = 0$ by Theorem 5.52. Then by Theorem 5.51 it
follows that $T$ is one to one.

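Thus $T$ is an isomorphism of $\mathbb{R}^2$ and the two dimensional subspace of $\mathbb{R}^3$ which is the span of the
columns of the given matrix. Now in particular,

\[ T(e_1) = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad T(e_2) = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]

Thus

\[ T^{-1}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = e_1, \quad T^{-1}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = e_2 \]

Extend $T^{-1}$ to all of $\mathbb{R}^3$ by defining

\[ T^{-1}\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = e_1 \]

Notice that the vectors

\[ \left\{ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right\} \]

are linearly independent so $T^{-1}$ can be extended linearly to yield a linear transformation defined on $\mathbb{R}^3$.
The matrix of $T^{-1}$ denoted as $A$ needs to satisfy

\[ A \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \]

and so

\[ A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}^{-1} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

Note that

\[ \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

so the restriction to $V$ of matrix multiplication by this matrix yields $T^{-1}$.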
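The claim that $A$ undoes $T$ on the image can be spot-checked. This is a small sketch, assuming the matrix of $T$ is the $3 \times 2$ matrix whose columns are the values $T(e_1)$ and $T(e_2)$ given in the solution; the helper `matvec` and the test vectors are illustrative choices, not from the text.

```python
def matvec(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

# Matrix of T: R^2 -> R^3, assumed from T(e1) = (1, 1, 0) and T(e2) = (0, 0, 1).
T = [[1, 0],
     [1, 0],
     [0, 1]]
# Matrix found in the solution for the extension of T^{-1} to R^3.
A = [[0, 1, 0],
     [0, 0, 1]]

# Applying A after T should return every vector of R^2 unchanged.
samples = [[1, 0], [0, 1], [3, -2]]
print(all(matvec(A, matvec(T, x)) == x for x in samples))
```

Because every vector in the image has the form $T(x)$, checking $A(T(x)) = x$ is the same as checking that $A$ restricted to the image agrees with $T^{-1}$.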


Exercises 


Exercise 5.7.1 Let $V = \mathbb{R}^3$ and let

W  =  span  ( S ) ,  where  S 


2 

2 

-2 


Find a basis of $W$ consisting of vectors in $S$.

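Exercise 5.7.2 Let $T$ be a linear transformation given by

\[ T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]

Find a basis for $\ker(T)$ and $\mathrm{im}(T)$.

Exercise 5.7.3 Let $T$ be a linear transformation given by

\[ T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]

Find a basis for $\ker(T)$ and $\mathrm{im}(T)$.

Exercise 5.7.4 Let $V = \mathbb{R}^3$ and let


$W = \mathrm{span}$


Extend this basis of $W$ to a basis of $V$.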

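Exercise 5.7.5 Let $T$ be a linear transformation given by

\[ T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]

What is $\dim(\ker(T))$?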


5.8  The  Matrix  of  a  Linear  Transformation  II 


We  begin  this  section  with  an  important  lemma. 


Lemma  5.55:  Mapping  of  a  Basis 


Let $T : \mathbb{R}^n \mapsto \mathbb{R}^n$ be an isomorphism. Then $T$ maps any basis of $\mathbb{R}^n$ to another basis for $\mathbb{R}^n$.
Conversely, if $T : \mathbb{R}^n \mapsto \mathbb{R}^n$ is a linear transformation which maps a basis of $\mathbb{R}^n$ to another basis of
$\mathbb{R}^n$, then it is an isomorphism.


Proof. First, suppose $T : \mathbb{R}^n \mapsto \mathbb{R}^n$ is a linear transformation which is one to one and onto. Let $\{v_1, \cdots, v_n\}$
be a basis for $\mathbb{R}^n$. We wish to show that $\{T(v_1), \cdots, T(v_n)\}$ is also a basis for $\mathbb{R}^n$.

First consider why it is linearly independent. Suppose $\sum_{k=1}^{n} a_k T(v_k) = 0$. Then by linearity we have
$T\left(\sum_{k=1}^{n} a_k v_k\right) = 0$ and since $T$ is one to one, it follows that $\sum_{k=1}^{n} a_k v_k = 0$. This requires that each $a_k = 0$
because $\{v_1, \cdots, v_n\}$ is independent, and it follows that $\{T(v_1), \cdots, T(v_n)\}$ is linearly independent.

Next take $w \in \mathbb{R}^n$. Since $T$ is onto, there exists $v \in \mathbb{R}^n$ such that $T(v) = w$. Since $\{v_1, \cdots, v_n\}$ is a basis,
in particular it is a spanning set and there are scalars $b_k$ such that $T\left(\sum_{k=1}^{n} b_k v_k\right) = T(v) = w$. Therefore
$w = \sum_{k=1}^{n} b_k T(v_k)$ which is in the $\mathrm{span}\{T(v_1), \cdots, T(v_n)\}$. Therefore, $\{T(v_1), \cdots, T(v_n)\}$ is a basis as
claimed.

Suppose now that $T : \mathbb{R}^n \mapsto \mathbb{R}^n$ is a linear transformation such that $T(v_i) = w_i$ where $\{v_1, \cdots, v_n\}$ and
$\{w_1, \cdots, w_n\}$ are two bases for $\mathbb{R}^n$.

To show that $T$ is one to one, let $T\left(\sum_{k=1}^{n} c_k v_k\right) = 0$. Then $\sum_{k=1}^{n} c_k T(v_k) = \sum_{k=1}^{n} c_k w_k = 0$. It follows
that each $c_k = 0$ because it is given that $\{w_1, \cdots, w_n\}$ is linearly independent. Hence $T\left(\sum_{k=1}^{n} c_k v_k\right) = 0$
implies that $\sum_{k=1}^{n} c_k v_k = 0$ and so $T$ is one to one.

To show that $T$ is onto, let $w$ be an arbitrary vector in $\mathbb{R}^n$. This vector can be written as $w = \sum_{k=1}^{n} d_k w_k =
\sum_{k=1}^{n} d_k T(v_k) = T\left(\sum_{k=1}^{n} d_k v_k\right)$. Therefore, $T$ is also onto. ♠

Consider  now  an  important  definition. 


Consider  the  following  example. 


Example  5.57:  Coordinate  Vector 

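Let $B = \left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \end{bmatrix} \right\}$ be a basis of $\mathbb{R}^2$ and let $x = \begin{bmatrix} 3 \\ -1 \end{bmatrix}$ be a vector in $\mathbb{R}^2$. Find $C_B(x)$.

Solution. First, note the order of the basis is important so label the vectors in the basis $B$ as

\[ B = \left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \end{bmatrix} \right\} = \{v_1, v_2\} \]

Now we need to find $a_1, a_2$ such that $x = a_1 v_1 + a_2 v_2$, that is:

\[ \begin{bmatrix} 3 \\ -1 \end{bmatrix} = a_1 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + a_2 \begin{bmatrix} -1 \\ 1 \end{bmatrix} \]

Solving this system gives $a_1 = 2, a_2 = -1$. Therefore the coordinate vector of $x$ with respect to the basis
$B$ is

\[ C_B(x) = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \end{bmatrix} \]

♠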
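Finding a coordinate vector in $\mathbb{R}^2$ amounts to solving a $2 \times 2$ linear system. The sketch below, which is only one possible approach, uses Cramer's rule with exact fractions; the function name `coords_2d` is a hypothetical helper introduced here, and the inputs are the basis and vector from Example 5.57.

```python
from fractions import Fraction

def coords_2d(v1, v2, x):
    """Return (a1, a2) with x = a1*v1 + a2*v2, using Cramer's rule."""
    det = v1[0] * v2[1] - v2[0] * v1[1]
    if det == 0:
        raise ValueError("v1 and v2 do not form a basis of R^2")
    a1 = Fraction(x[0] * v2[1] - v2[0] * x[1], det)
    a2 = Fraction(v1[0] * x[1] - x[0] * v1[1], det)
    return a1, a2

v1, v2 = (1, 0), (-1, 1)     # the ordered basis B
x = (3, -1)
a1, a2 = coords_2d(v1, v2, x)
print(a1, a2)

# Rebuild x from its coordinates as a consistency check.
rebuilt = (a1 * v1[0] + a2 * v2[0], a1 * v1[1] + a2 * v2[1])
print(rebuilt == x)
```

Swapping the order of `v1` and `v2` swaps the two coordinates, which is why the order of the basis matters.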


Given any basis $B$, one can easily verify that the coordinate function is actually an isomorphism.

We now discuss the main result of this section, that is, how to represent a linear transformation with
respect to different bases.


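Proof. The above equation 5.5 can be represented by the following diagram.

\[ \begin{array}{ccc}
\mathbb{R}^n & \xrightarrow{\;\;T\;\;} & \mathbb{R}^m \\
C_{B_1} \downarrow & \circ & \downarrow C_{B_2} \\
\mathbb{R}^n & \xrightarrow{M_{B_2B_1}} & \mathbb{R}^m
\end{array} \]

Since $C_{B_1}$ is an isomorphism, then the matrix we are looking for is the matrix of the linear transformation

\[ C_{B_2} T C_{B_1}^{-1} : \mathbb{R}^n \mapsto \mathbb{R}^m. \]

By Theorem 5.6, the columns are given by the image of the standard basis $\{e_1, e_2, \cdots, e_n\}$. But since
$C_{B_1}^{-1}(e_i) = v_i$, we readily obtain that

\[ \begin{aligned}
M_{B_2B_1} &= \begin{bmatrix} C_{B_2}TC_{B_1}^{-1}(e_1) & C_{B_2}TC_{B_1}^{-1}(e_2) & \cdots & C_{B_2}TC_{B_1}^{-1}(e_n) \end{bmatrix} \\
&= \begin{bmatrix} C_{B_2}(T(v_1)) & C_{B_2}(T(v_2)) & \cdots & C_{B_2}(T(v_n)) \end{bmatrix}
\end{aligned} \]

and this completes the proof. ♠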

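Consider the following example.

Solution. By Theorem 5.59, the columns of $M_{B_2B_1}$ are the coordinate vectors of $T(v_1), T(v_2)$ with respect
to $B_2$.

Since

\[ T\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \]

a standard calculation yields

\[ \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \left( \frac{1}{2} \right) \begin{bmatrix} 1 \\ 1 \end{bmatrix} + \left( -\frac{1}{2} \right) \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \]

so the first column of $M_{B_2B_1}$ is $\begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \end{bmatrix}$.

The second column is found in a similar way. We have

\[ T\left( \begin{bmatrix} -1 \\ 1 \end{bmatrix} \right) = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]

and with respect to $B_2$ calculate:

\[ \begin{bmatrix} 1 \\ -1 \end{bmatrix} = 0 \begin{bmatrix} 1 \\ 1 \end{bmatrix} + 1 \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]

Hence the second column of $M_{B_2B_1}$ is given by $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$. We thus obtain

\[ M_{B_2B_1} = \begin{bmatrix} \frac{1}{2} & 0 \\ -\frac{1}{2} & 1 \end{bmatrix} \]

We can verify that this is the correct matrix $M_{B_2B_1}$ on the specific example

\[ v = \begin{bmatrix} 3 \\ -1 \end{bmatrix} \]

First applying $T$ gives

\[ T(v) = T\left( \begin{bmatrix} 3 \\ -1 \end{bmatrix} \right) = \begin{bmatrix} -1 \\ 3 \end{bmatrix} \]

and one can compute that

\[ C_{B_2}\left( \begin{bmatrix} -1 \\ 3 \end{bmatrix} \right) = \begin{bmatrix} 1 \\ -2 \end{bmatrix}. \]

On the other hand, one computes $C_{B_1}(v)$ as

\[ C_{B_1}\left( \begin{bmatrix} 3 \\ -1 \end{bmatrix} \right) = \begin{bmatrix} 2 \\ -1 \end{bmatrix}, \]

and finally applying $M_{B_2B_1}$ gives

\[ \begin{bmatrix} \frac{1}{2} & 0 \\ -\frac{1}{2} & 1 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \end{bmatrix} \]

as above.

We see that the same vector results from either method, as suggested by Theorem 5.59. ♠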
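The column-by-column construction of Theorem 5.59 can be sketched in a few lines. The code below is an illustration under stated assumptions: the formula for `T` (swapping the two coordinates) is inferred from the values of $T$ used in the solution and is not quoted from the example statement, and `coords_2d` is a hypothetical helper based on Cramer's rule.

```python
from fractions import Fraction

def coords_2d(v1, v2, x):
    """Return (a1, a2) with x = a1*v1 + a2*v2, using Cramer's rule."""
    det = v1[0] * v2[1] - v2[0] * v1[1]
    a1 = Fraction(x[0] * v2[1] - v2[0] * x[1], det)
    a2 = Fraction(v1[0] * x[1] - x[0] * v1[1], det)
    return a1, a2

def T(v):
    # Assumed formula: T swaps the coordinates, consistent with
    # T(1,0) = (0,1), T(-1,1) = (1,-1) and T(3,-1) = (-1,3) in the solution.
    return (v[1], v[0])

B1 = [(1, 0), (-1, 1)]
B2 = [(1, 1), (1, -1)]

# Column j of the matrix is the B2-coordinate vector of T applied to the j-th vector of B1.
cols = [coords_2d(B2[0], B2[1], T(v)) for v in B1]
M = [[cols[0][0], cols[1][0]],
     [cols[0][1], cols[1][1]]]
print(M)

# Check on v = (3, -1): M times C_B1(v) should equal C_B2(T(v)).
v = (3, -1)
c1 = coords_2d(B1[0], B1[1], v)
lhs = (M[0][0] * c1[0] + M[0][1] * c1[1], M[1][0] * c1[0] + M[1][1] * c1[1])
rhs = coords_2d(B2[0], B2[1], T(v))
print(lhs == rhs)
```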

If the bases $B_1$ and $B_2$ are equal, say $B$, then we write $M_B$ instead of $M_{BB}$. The following example
illustrates how to compute such a matrix. Note that this is what we did earlier when we considered only
$B_1 = B_2$ to be the standard basis.


Example  5.61:  Matrix  of  a  Linear  Transformation  with  respect  to  an  Arbitrary  Basis 


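Consider the basis $B$ of $\mathbb{R}^3$ given by

\[ B = \{v_1, v_2, v_3\} = \left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} \right\} \]

And let $T : \mathbb{R}^3 \mapsto \mathbb{R}^3$ be the linear transformation defined on $B$ as:

\[ T\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, \quad T\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}, \quad T\begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \]

1. Find the matrix $M_B$ of $T$ relative to the basis $B$.

2. Then find the usual matrix of $T$ with respect to the standard basis of $\mathbb{R}^3$.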


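Solution.

Equation 5.5 gives $C_B T = M_B C_B$, and thus $M_B = C_B T C_B^{-1}$.

Now $C_B(v_i) = e_i$, so the matrix of $C_B^{-1}$ (with respect to the standard basis) is given by

\[ \begin{bmatrix} C_B^{-1}(e_1) & C_B^{-1}(e_2) & C_B^{-1}(e_3) \end{bmatrix} = \begin{bmatrix} 1 & 1 & -1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix} \]

Moreover the matrix of $TC_B^{-1}$ is given by

\[ \begin{bmatrix} TC_B^{-1}(e_1) & TC_B^{-1}(e_2) & TC_B^{-1}(e_3) \end{bmatrix} = \begin{bmatrix} 1 & 1 & 0 \\ -1 & 2 & 1 \\ 1 & -1 & 1 \end{bmatrix} \]

Thus

\[ \begin{aligned}
M_B = C_B T C_B^{-1} &= [C_B^{-1}]^{-1}[TC_B^{-1}] \\
&= \begin{bmatrix} 1 & 1 & -1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}^{-1} \begin{bmatrix} 1 & 1 & 0 \\ -1 & 2 & 1 \\ 1 & -1 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 2 & -5 & 1 \\ -1 & 4 & 0 \\ 0 & -2 & 1 \end{bmatrix}
\end{aligned} \]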

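Consider how this works. Let $b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$ be an arbitrary vector in $\mathbb{R}^3$.

Apply $C_B^{-1}$ to $b$ to get

\[ b_1 \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + b_2 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + b_3 \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} \]

Apply $T$ to this linear combination to obtain

\[ b_1 \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} + b_2 \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} + b_3 \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} b_1 + b_2 \\ -b_1 + 2b_2 + b_3 \\ b_1 - b_2 + b_3 \end{bmatrix} \]

Now take the matrix $M_B$ of the transformation (as found above) and multiply it by $b$.

\[ \begin{bmatrix} 2 & -5 & 1 \\ -1 & 4 & 0 \\ 0 & -2 & 1 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} 2b_1 - 5b_2 + b_3 \\ -b_1 + 4b_2 \\ -2b_2 + b_3 \end{bmatrix} \]

Is this the coordinate vector of the above relative to the given basis? We check as follows.

\[ (2b_1 - 5b_2 + b_3) \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + (-b_1 + 4b_2) \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + (-2b_2 + b_3) \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} b_1 + b_2 \\ -b_1 + 2b_2 + b_3 \\ b_1 - b_2 + b_3 \end{bmatrix} \]

You see it is the same thing.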

Now let's find the matrix of $T$ with respect to the standard basis. Let $A$ be this matrix. That is,
multiplication by $A$ is the same as doing $T$. Thus

\[ A \begin{bmatrix} 1 & 1 & -1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 0 \\ -1 & 2 & 1 \\ 1 & -1 & 1 \end{bmatrix} \]

Hence

\[ A = \begin{bmatrix} 1 & 1 & 0 \\ -1 & 2 & 1 \\ 1 & -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & -1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}^{-1} = \begin{bmatrix} 0 & 0 & 1 \\ 2 & 3 & -3 \\ -3 & -2 & 4 \end{bmatrix} \]

Of course this is a very different matrix than the matrix of the linear transformation with respect to the
nonstandard basis. ♠
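Both matrices in Example 5.61 come from the same two ingredients: the matrix $P$ whose columns are $v_1, v_2, v_3$ and the matrix $Q$ whose columns are $T(v_1), T(v_2), T(v_3)$. The following sketch, assuming those names (`P`, `Q`, `inverse`, and `matmul` are labels introduced here, not in the text), recomputes $M_B = P^{-1}Q$ and $A = QP^{-1}$ with exact fractions.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def inverse(M):
    """Gauss-Jordan inverse with exact fractions; assumes M is square and invertible."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        # Bring a row with a nonzero entry in column c into pivot position.
        p = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        lead = aug[c][c]
        aug[c] = [x / lead for x in aug[c]]
        for i in range(n):
            if i != c:
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]

P = [[1, 1, -1], [0, 1, 1], [1, 1, 0]]     # columns are v1, v2, v3
Q = [[1, 1, 0], [-1, 2, 1], [1, -1, 1]]    # columns are T(v1), T(v2), T(v3)

M_B = matmul(inverse(P), Q)   # matrix of T relative to B
A = matmul(Q, inverse(P))     # matrix of T relative to the standard basis
print(M_B)
print(A)
```

The two results are related by $A = P M_B P^{-1}$, which is one way to see why they differ even though they describe the same transformation.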


Exercises 


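Exercise 5.8.1 Let $B = \left\{ \begin{bmatrix} 2 \\ -1 \end{bmatrix}, \begin{bmatrix} 3 \\ 2 \end{bmatrix} \right\}$ be a basis of $\mathbb{R}^2$ and let $x = \begin{bmatrix} 5 \\ -7 \end{bmatrix}$ be a vector in $\mathbb{R}^2$. Find
$C_B(x)$.

Exercise 5.8.2 Let $B = \left\{ \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix} \right\}$ be a basis of $\mathbb{R}^3$ and let $x = \begin{bmatrix} 5 \\ -1 \\ 4 \end{bmatrix}$ be a vector
in $\mathbb{R}^3$. Find $C_B(x)$.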


Exercise  5.8.3  Let  T  :  R2  R2  be  a  linear  transformation  defined  by  T 
Consider  the  two  bases 

B\  =  {V1,V2> 

and 

B2  = 


( 

a 

\ 

a  +  b 

l 

b 

)- 

a  —  b 

1 

0 


-1 

1 


Find the matrix $M_{B_2,B_1}$ of $T$ with respect to the bases $B_1$ and $B_2$.


5.9  The  General  Solution  of  a  Linear  System 


Outcomes 


A.  Use  linear  transformations  to  determine  the  particular  solution  and  general  solution  to  a  sys¬ 
tem  of  equations. 

B. Find the kernel of a linear transformation.

Recall the definition of a linear transformation discussed above. $T$ is a linear transformation if
whenever $x, y$ are vectors and $k, p$ are scalars,

\[ T(kx + py) = kT(x) + pT(y) \]

Thus linear transformations distribute across addition and pass scalars to the outside.

It turns out that we can use linear transformations to solve linear systems of equations. Indeed given
a system of linear equations of the form $Ax = b$, one may rephrase this as $T(x) = b$ where $T$ is the linear
transformation $T_A$ induced by the coefficient matrix $A$. With this in mind consider the following definition.


Definition  5.62:  Particular  Solution  of  a  System  of  Equations 


Suppose a linear system of equations can be written in the form

\[ T(x) = b \]

If $T(x_p) = b$, then $x_p$ is called a particular solution of the linear system.


Recall that a system is called homogeneous if every equation in the system is equal to 0. Suppose we
represent a homogeneous system of equations by $T(x) = 0$. It turns out that the $x$ for which $T(x) = 0$
form a special set called the null space of $T$. We may also refer to the null space as the kernel of $T$, and
we write $\ker(T)$.

Consider  the  following  definition. 


We  may  also  refer  to  the  kernel  of  T  as  the  solution  space  of  the  equation  T  (x)  =  0. 
Consider  the  following  example. 


Example  5.64:  The  Kernel  of  the  Derivative 


Let $\frac{d}{dx}$ denote the linear transformation defined on $f$, the functions which are defined on $\mathbb{R}$ and have
a continuous derivative. Find $\ker\left(\frac{d}{dx}\right)$.


Solution. The example asks for functions $f$ with the property that $\frac{df}{dx} = 0$. As you may know from
calculus, these functions are the constant functions. Thus $\ker\left(\frac{d}{dx}\right)$ is the set of constant functions. ♠

Definition 5.63 states that $\ker(T)$ is the set of solutions to the equation,

\[ T(x) = 0 \]

Since we can write $T(x)$ as $Ax$, you have been solving such equations for quite some time.

We have spent a lot of time finding solutions to systems of equations in general, as well as homogeneous
systems. Suppose we look at a system given by $Ax = b$, and consider the related homogeneous
system. By this, we mean that we replace $b$ by $0$ and look at $Ax = 0$. It turns out that there is a very
important relationship between the solutions of the original system and the solutions of the associated
homogeneous system. In the following theorem, we use linear transformations to denote a system of
equations. Remember that $T(x) = Ax$.


Theorem  5.65:  Particular  Solution  and  General  Solution 


Suppose $x_p$ is a solution to the linear system given by,

\[ T(x) = b \]

Then if $y$ is any other solution to $T(x) = b$, there exists $x_0 \in \ker(T)$ such that

\[ y = x_p + x_0 \]

Hence, every solution to the linear system can be written as a sum of a particular solution, $x_p$, and a
solution $x_0$ to the associated homogeneous system given by $T(x) = 0$.


Proof. Consider $y - x_p = y + (-1)x_p$. Then $T(y - x_p) = T(y) - T(x_p)$. Since $y$ and $x_p$ are both solutions
to the system, it follows that $T(y) = b$ and $T(x_p) = b$.

Hence, $T(y) - T(x_p) = b - b = 0$. Let $x_0 = y - x_p$. Then, $T(x_0) = 0$ so $x_0$ is a solution to the associated
homogeneous system and so is in $\ker(T)$.

Sometimes people remember the above theorem in the following form. The solutions to the system
$T(x) = b$ are given by $x_p + \ker(T)$ where $x_p$ is a particular solution to $T(x) = b$.

For  now,  we  have  been  speaking  about  the  kernel  or  null  space  of  a  linear  transformation  T.  However, 
we  know  that  every  linear  transformation  T  is  determined  by  some  matrix  A.  Therefore,  we  can  also  speak 
about  the  null  space  of  a  matrix.  Consider  the  following  example. 


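Solution. We are asked to find $\{\vec{x} : A\vec{x} = \vec{0}\}$. In other words we want to solve the system, $A\vec{x} = \vec{0}$. Let
$\vec{x} = \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}$. Then this amounts to solving

\[ \begin{bmatrix} 1 & 2 & 3 & 0 \\ 2 & 1 & 1 & 2 \\ 4 & 5 & 7 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

This is the linear system

\[ \begin{aligned}
x + 2y + 3z &= 0 \\
2x + y + z + 2w &= 0 \\
4x + 5y + 7z + 2w &= 0
\end{aligned} \]

To solve, set up the augmented matrix and row reduce to find the reduced row-echelon form.

\[ \left[ \begin{array}{cccc|c} 1 & 2 & 3 & 0 & 0 \\ 2 & 1 & 1 & 2 & 0 \\ 4 & 5 & 7 & 2 & 0 \end{array} \right] \rightarrow \cdots \rightarrow \left[ \begin{array}{cccc|c} 1 & 0 & -\frac{1}{3} & \frac{4}{3} & 0 \\ 0 & 1 & \frac{5}{3} & -\frac{2}{3} & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array} \right] \]

This yields $x = \frac{1}{3}z - \frac{4}{3}w$ and $y = \frac{2}{3}w - \frac{5}{3}z$. Since $\mathrm{null}(A)$ consists of the solutions to this system, it consists
of vectors of the form

\[ \begin{bmatrix} \frac{1}{3}z - \frac{4}{3}w \\ \frac{2}{3}w - \frac{5}{3}z \\ z \\ w \end{bmatrix} = z \begin{bmatrix} \frac{1}{3} \\ -\frac{5}{3} \\ 1 \\ 0 \end{bmatrix} + w \begin{bmatrix} -\frac{4}{3} \\ \frac{2}{3} \\ 0 \\ 1 \end{bmatrix} \]

♠

Consider the following example.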




Solution. Note the matrix of this system is the same as the matrix in Example 5.66. Therefore, from
Theorem 5.65, you will obtain all solutions to the above linear system by adding a particular solution $x_p$
to the solutions of the associated homogeneous system, $x$. One particular solution is given above by

\[ x_p = \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 2 \\ 1 \end{bmatrix} \tag{5.6} \]

Using this particular solution along with the solutions found in Example 5.66, we obtain the following
solutions,

\[ z \begin{bmatrix} \frac{1}{3} \\ -\frac{5}{3} \\ 1 \\ 0 \end{bmatrix} + w \begin{bmatrix} -\frac{4}{3} \\ \frac{2}{3} \\ 0 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \\ 2 \\ 1 \end{bmatrix} \]

Hence, any solution to the above linear system is of this form. ♠
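The structure "particular solution plus kernel" from Theorem 5.65 can be checked directly on this example. The sketch below uses the matrix, the particular solution, and the two kernel vectors from Examples 5.66 and 5.67; the right-hand side `b` is computed as $A x_p$ rather than quoted, and the helper `matvec` and the sample values of $z$ and $w$ are illustrative assumptions.

```python
from fractions import Fraction as F

def matvec(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

A = [[1, 2, 3, 0],
     [2, 1, 1, 2],
     [4, 5, 7, 2]]
x_p = [1, 1, 2, 1]                   # particular solution (5.6)
k1 = [F(1, 3), F(-5, 3), 1, 0]       # kernel vector multiplied by z
k2 = [F(-4, 3), F(2, 3), 0, 1]       # kernel vector multiplied by w

b = matvec(A, x_p)                   # right-hand side implied by x_p
print(b)

# Both kernel vectors are sent to zero by A.
print(matvec(A, k1), matvec(A, k2))

# Every choice of z and w gives another solution of Ax = b.
for z, w in [(0, 0), (1, 0), (0, 1), (3, -2)]:
    x = [p + z * a + w * c for p, a, c in zip(x_p, k1, k2)]
    assert matvec(A, x) == b
```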


Exercises 


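Exercise 5.9.1 Write the solution set of the following system as a linear combination of vectors

\[ \begin{bmatrix} 1 & -1 & 2 \\ 1 & -2 & 1 \\ 3 & -4 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

Exercise 5.9.2 Using Problem 5.9.1 find the general solution to the following linear system.

\[ \begin{bmatrix} 1 & -1 & 2 \\ 1 & -2 & 1 \\ 3 & -4 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix} \]

Exercise 5.9.3 Write the solution set of the following system as a linear combination of vectors

\[ \begin{bmatrix} 0 & -1 & 2 \\ 1 & -2 & 1 \\ 1 & -4 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

Exercise 5.9.4 Using Problem 5.9.3 find the general solution to the following linear system.

\[ \begin{bmatrix} 0 & -1 & 2 \\ 1 & -2 & 1 \\ 1 & -4 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \]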



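Exercise 5.9.5 Write the solution set of the following system as a linear combination of vectors.

\[ \begin{bmatrix} 1 & -1 & 2 \\ 1 & -2 & 0 \\ 3 & -4 & 4 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

Exercise 5.9.6 Using Problem 5.9.5 find the general solution to the following linear system.

\[ \begin{bmatrix} 1 & -1 & 2 \\ 1 & -2 & 0 \\ 3 & -4 & 4 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix} \]

Exercise 5.9.7 Write the solution set of the following system as a linear combination of vectors

\[ \begin{bmatrix} 0 & -1 & 2 \\ 1 & 0 & 1 \\ 1 & -2 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

Exercise 5.9.8 Using Problem 5.9.7 find the general solution to the following linear system.

\[ \begin{bmatrix} 0 & -1 & 2 \\ 1 & 0 & 1 \\ 1 & -2 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \]

Exercise 5.9.9 Write the solution set of the following system as a linear combination of vectors

\[ \begin{bmatrix} 1 & 0 & 1 & 1 \\ 1 & -1 & 1 & 0 \\ 3 & -1 & 3 & 2 \\ 3 & 3 & 0 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \]

Exercise 5.9.10 Using Problem 5.9.9 find the general solution to the following linear system.

\[ \begin{bmatrix} 1 & 0 & 1 & 1 \\ 1 & -1 & 1 & 0 \\ 3 & -1 & 3 & 2 \\ 3 & 3 & 0 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 4 \\ 3 \end{bmatrix} \]

Exercise 5.9.11 Write the solution set of the following system as a linear combination of vectors

\[ \begin{bmatrix} 1 & 1 & 0 & 1 \\ 2 & 1 & 1 & 2 \\ 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \]

Exercise 5.9.12 Using Problem 5.9.11 find the general solution to the following linear system.

\[ \begin{bmatrix} 1 & 1 & 0 & 1 \\ 2 & 1 & 1 & 2 \\ 1 & 0 & 1 & 1 \\ 0 & -1 & 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ -3 \\ 0 \end{bmatrix} \]

Exercise 5.9.13 Write the solution set of the following system as a linear combination of vectors

\[ \begin{bmatrix} 1 & 1 & 0 & 1 \\ 1 & -1 & 1 & 0 \\ 3 & 1 & 1 & 2 \\ 3 & 3 & 0 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \]

Exercise 5.9.14 Using Problem 5.9.13 find the general solution to the following linear system.

\[ \begin{bmatrix} 1 & 1 & 0 & 1 \\ 1 & -1 & 1 & 0 \\ 3 & 1 & 1 & 2 \\ 3 & 3 & 0 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 4 \\ 3 \end{bmatrix} \]

Exercise 5.9.15 Write the solution set of the following system as a linear combination of vectors

\[ \begin{bmatrix} 1 & 1 & 0 & 1 \\ 2 & 1 & 1 & 2 \\ 1 & 0 & 1 & 1 \\ 0 & -1 & 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \]

Exercise 5.9.16 Using Problem 5.9.15 find the general solution to the following linear system.

\[ \begin{bmatrix} 1 & 1 & 0 & 1 \\ 2 & 1 & 1 & 2 \\ 1 & 0 & 1 & 1 \\ 0 & -1 & 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ -3 \\ 1 \end{bmatrix} \]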

Exercise  5.9.17  Suppose  Ax  =  b  has  a  solution.  Explain  why  the  solution  is  unique  precisely  when  Ax  =  0 
has  only  the  trivial  solution. 


6.  Complex  Numbers 


6.1  Complex  Numbers 


Outcomes 


A.  Understand  the  geometric  significance  of  a  complex  number  as  a  point  in  the  plane. 

B.  Prove  algebraic  properties  of  addition  and  multiplication  of  complex  numbers,  and  apply 
these  properties.  Understand  the  action  of  taking  the  conjugate  of  a  complex  number. 

C.  Understand  the  absolute  value  of  a  complex  number  and  how  to  find  it  as  well  as  its  geometric 
significance. 


Although very powerful, the real numbers are inadequate to solve equations such as $x^2 + 1 = 0$, and this
is where complex numbers come in. We define the number $i$ as the imaginary number such that $i^2 = -1$,
and define complex numbers as those of the form $z = a + bi$ where $a$ and $b$ are real numbers. We call this
the standard form, or Cartesian form, of the complex number $z$. Then, we refer to $a$ as the real part of $z$,
and $b$ as the imaginary part of $z$. It turns out that such numbers not only solve the above equation, but
in fact also solve any polynomial equation of degree at least 1 with complex coefficients. This property, called the
Fundamental Theorem of Algebra, is sometimes referred to by saying $\mathbb{C}$ is algebraically closed. Gauss is
usually credited with giving a proof of this theorem in 1797 but many others worked on it and the first
completely correct proof was due to Argand in 1806.

Just as a real number can be considered as a point on the line, a complex number $z = a + bi$ can be
considered as a point $(a, b)$ in the plane whose $x$ coordinate is $a$ and whose $y$ coordinate is $b$. For example,
in the following picture, the point $z = 3 + 2i$ can be represented as the point in the plane with coordinates
$(3, 2)$.


[Figure: the point $z = (3, 2) = 3 + 2i$ plotted in the plane.]


Addition of complex numbers is defined as follows.

\[ (a + bi) + (c + di) = (a + c) + (b + d)i \]

This addition obeys all the usual properties as the following theorem indicates.


The  proof  of  this  theorem  is  left  as  an  exercise  for  the  reader. 

Now, multiplication of complex numbers is defined the way you would expect, recalling that $i^2 = -1$.

\[ \begin{aligned}
(a + bi)(c + di) &= ac + adi + bci + i^2 bd \\
&= (ac - bd) + (ad + bc)i
\end{aligned} \]
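The multiplication rule can be compared against Python's built-in `complex` type. This is a minimal sketch; the function name `multiply` and the sample numbers $2 + 3i$ and $4 - i$ are chosen here only for illustration.

```python
def multiply(a, b, c, d):
    """(a + bi)(c + di) as the pair (real part, imaginary part)."""
    return (a * c - b * d, a * d + b * c)

# Sample values (an assumption for illustration): (2 + 3i)(4 - i).
re_part, im_part = multiply(2, 3, 4, -1)
builtin = complex(2, 3) * complex(4, -1)
print(re_part, im_part)
print(builtin == complex(re_part, im_part))
```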


Consider  the  following  examples. 


The following are important properties of multiplication of complex numbers.

Theorem 6.3: Properties of Multiplication of Complex Numbers


Let $z$, $w$ and $v$ be complex numbers. Then, the following properties of multiplication hold.

• Commutative Law for Multiplication

\[ zw = wz \]

• Associative Law for Multiplication

\[ (zw)v = z(wv) \]

• Multiplicative Identity

\[ 1z = z \]

• Existence of Multiplicative Inverse

For each $z \neq 0$, there exists $z^{-1}$ such that $zz^{-1} = 1$

• Distributive Law

\[ z(w + v) = zw + zv \]


You may wish to verify some of these statements. The real numbers also satisfy the above axioms, and
in general any mathematical structure which satisfies these axioms is called a field. There are many other
fields, including finite ones that are particularly useful for cryptography. The reason for specifying these
axioms is that linear algebra is all about fields, and we can do just about anything in this subject using any
field. Here, however, the fields of most interest will be the familiar field of real numbers, denoted as $\mathbb{R}$,
and the field of complex numbers, denoted as $\mathbb{C}$.

An important construction regarding complex numbers is the complex conjugate denoted by a horizontal
line above the number, $\overline{z}$. It is defined as follows.


Geometrically, the action of the conjugate is to reflect a given complex number across the $x$ axis.
Algebraically, it changes the sign on the imaginary part of the complex number. Therefore, for a real
number $a$, $\overline{a} = a$.

Consider the following computation.

\[ \begin{aligned}
\overline{(a + bi)}\,(a + bi) &= (a - bi)(a + bi) \\
&= a^2 + b^2 - (ab - ab)i = a^2 + b^2
\end{aligned} \]

Notice that there is no imaginary part in the product, thus multiplying a complex number by its conjugate
results in a real number.


Division of complex numbers is defined as follows. Let $z = a + bi$ and $w = c + di$ be complex numbers
such that $c, d$ are not both zero. Then the quotient $z$ divided by $w$ is

\[ \begin{aligned}
\frac{z}{w} = \frac{a + bi}{c + di} &= \frac{a + bi}{c + di} \times \frac{c - di}{c - di} \\
&= \frac{(ac + bd) + (bc - ad)i}{c^2 + d^2} \\
&= \frac{ac + bd}{c^2 + d^2} + \frac{bc - ad}{c^2 + d^2}\,i.
\end{aligned} \]

In other words, the quotient $\frac{z}{w}$ is obtained by multiplying both top and bottom of $\frac{z}{w}$ by $\overline{w}$ and then
simplifying the expression.

Interestingly every nonzero complex number $a + bi$ has a unique multiplicative inverse. In other words,
for a nonzero complex number $z$, there exists a number $z^{-1}$ (or $\frac{1}{z}$) so that $zz^{-1} = 1$. Note that $z = a + bi$ is
nonzero exactly when $a^2 + b^2 \neq 0$, and its inverse can be written in standard form as defined now.


Note that we may write $z^{-1}$ as $\frac{1}{z}$. Both notations represent the multiplicative inverse of the complex
number $z$. Consider now an example.
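As a numerical illustration of division and of the multiplicative inverse, the sketch below applies the conjugate rule with exact fractions. The function names `divide` and `inverse` and the sample numbers $1 + 2i$ and $3 - 4i$ are assumptions made for this illustration only.

```python
from fractions import Fraction

def divide(a, b, c, d):
    """(a + bi)/(c + di), found by multiplying top and bottom by the conjugate c - di."""
    denom = c * c + d * d
    if denom == 0:
        raise ZeroDivisionError("w = c + di must be nonzero")
    return (Fraction(a * c + b * d, denom), Fraction(b * c - a * d, denom))

def inverse(a, b):
    """Multiplicative inverse of a + bi, computed as 1/(a + bi)."""
    return divide(1, 0, a, b)

q = divide(1, 2, 3, -4)     # (1 + 2i)/(3 - 4i), sample values
inv = inverse(3, -4)        # 1/(3 - 4i)
print(q, inv)

# Multiplying the inverse back by 3 - 4i should give 1 + 0i.
check = (inv[0] * 3 - inv[1] * (-4), inv[0] * (-4) + inv[1] * 3)
print(check == (1, 0))
```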


Another important construction of complex numbers is that of the absolute value, also called the modulus.
Consider the following definition.

Definition 6.10: Absolute Value


The absolute value, or modulus, of a complex number, denoted $|z|$ is defined as follows.

\[ |a + bi| = \sqrt{a^2 + b^2} \]


Thus, if $z$ is the complex number $z = a + bi$, it follows that

\[ |z| = (z\overline{z})^{1/2} \]

Also from the definition, if $z = a + bi$ and $w = c + di$ are two complex numbers, then $|zw| = |z||w|$.
Take a moment to verify this.
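A quick numerical check of these two facts, as a sketch with sample values $z = 3 + 4i$ and $w = 1 - 2i$ chosen here for illustration (floating point is used, so products are compared with `math.isclose` rather than exact equality):

```python
import math

z, w = complex(3, 4), complex(1, -2)    # sample values (an assumption)

mod_from_formula = math.sqrt(z.real ** 2 + z.imag ** 2)      # sqrt(a^2 + b^2)
mod_from_conjugate = math.sqrt((z * z.conjugate()).real)     # (z * conj(z))^(1/2)
print(mod_from_formula, mod_from_conjugate, abs(z))

# |zw| = |z||w|
print(math.isclose(abs(z * w), abs(z) * abs(w)))
```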

The  triangle  inequality  is  an  important  property  of  the  absolute  value  of  complex  numbers.  There  are 
two  useful  versions  which  we  present  here,  although  the  first  one  is  officially  called  the  triangle  inequality. 


Proof. Let $z = a + bi$ and $w = c + di$. First note that

\[ z\overline{w} = (a + bi)(c - di) = ac + bd + (bc - ad)i \]

and so $|ac + bd| \le |z\overline{w}| = |z||w|$.

Then,

\[ \begin{aligned}
|z + w|^2 &= (a + c + i(b + d))(a + c - i(b + d)) \\
&= (a + c)^2 + (b + d)^2 = a^2 + c^2 + 2ac + 2bd + b^2 + d^2 \\
&\le |z|^2 + |w|^2 + 2|z||w| = (|z| + |w|)^2
\end{aligned} \]

Taking the square root, we have that

\[ |z + w| \le |z| + |w| \]

so this verifies the triangle inequality.

To get the second inequality, write

\[ z = z - w + w, \quad w = w - z + z \]

and so by the first form of the inequality we get both:

\[ |z| \le |z - w| + |w|, \quad |w| \le |z - w| + |z| \]

Hence, both $|z| - |w|$ and $|w| - |z|$ are no larger than $|z - w|$. This proves the second version because
$||z| - |w||$ is one of $|z| - |w|$ or $|w| - |z|$. ♠

With  this  definition,  it  is  important  to  note  the  following.  You  may  wish  to  take  the  time  to  verify  this 
remark. 

Let $z = a + bi$ and $w = c + di$. Then $|z - w| = \sqrt{(a - c)^2 + (b - d)^2}$. Thus the distance between the
point in the plane determined by the ordered pair $(a, b)$ and the ordered pair $(c, d)$ equals $|z - w|$ where $z$
and $w$ are as just described.

For example, consider the distance between $(2, 5)$ and $(1, 8)$. Letting $z = 2 + 5i$ and $w = 1 + 8i$, $z - w =
1 - 3i$, $(z - w)\overline{(z - w)} = (1 - 3i)(1 + 3i) = 10$ so $|z - w| = \sqrt{10}$.
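The distance computation and both forms of the triangle inequality can be spot-checked for the points $(2, 5)$ and $(1, 8)$ used above. This is a small sketch using Python's built-in `complex` type and `abs`; a single pair of numbers illustrates the inequalities but of course does not prove them.

```python
import math

z, w = complex(2, 5), complex(1, 8)

distance = abs(z - w)
print(math.isclose(distance, math.sqrt(10)))

# First form: |z + w| <= |z| + |w|
print(abs(z + w) <= abs(z) + abs(w))

# Second form: ||z| - |w|| <= |z - w|
print(abs(abs(z) - abs(w)) <= abs(z - w))
```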

Recall  that  we  refer  to  z  —  a  +  bi  as  the  standard  form  of  the  complex  number.  In  the  next  section,  we 
examine  another  form  in  which  we  can  express  the  complex  number. 


Exercises 


Exercise 6.1.1 Let $z = 2 + 7i$ and let $w = 3 - 8i$. Compute the following.

(a)  z  +  w 

( b)  z  —  2w 

(c)  zw 

(d)  f 

Exercise 6.1.2 Let $z = 1 - 4i$. Compute the following.

(a) $\overline{z}$

(b) $z^{-1}$

(c) $|z|$

Exercise 6.1.3 Let $z = 3 + 5i$ and $w = 2 - i$. Compute the following.

(a)  zw 

(b)  \zw\ 

(c) $z^{-1}w$

Exercise 6.1.4 If $z$ is a complex number, show there exists a complex number $w$ with $|w| = 1$ and $wz = |z|$.

Exercise 6.1.5 If $z, w$ are complex numbers prove $\overline{zw} = \overline{z}\,\overline{w}$ and then show by induction that $\overline{z_1 \cdots z_m} =
\overline{z_1} \cdots \overline{z_m}$. Also verify that $\overline{\sum_{k=1}^{m} z_k} = \sum_{k=1}^{m} \overline{z_k}$. In words this says the conjugate of a product equals the
product of the conjugates and the conjugate of a sum equals the sum of the conjugates.

Exercise 6.1.6 Suppose $p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ where all the $a_k$ are real numbers. Suppose
also that $p(z) = 0$ for some $z \in \mathbb{C}$. Show it follows that $p(\overline{z}) = 0$ also.

Exercise 6.1.7 I claim that $1 = -1$. Here is why.

\[ -1 = i^2 = \sqrt{-1}\sqrt{-1} = \sqrt{(-1)^2} = \sqrt{1} = 1 \]

This is clearly a remarkable result but is there something wrong with it? If so, what is wrong?


6.2  Polar  Form 


Outcomes 


A.  Convert  a  complex  number  from  standard  form  to  polar  form,  and  from  polar  form  to  standard 
form. 


In the previous section, we identified a complex number $z = a + bi$ with a point $(a, b)$ in the coordinate
plane. There is another form in which we can express the same number, called the polar form. The polar
form is the focus of this section. It will turn out to be very useful if not crucial for certain calculations as
we shall soon see.

Suppose $z = a + bi$ is a nonzero complex number, and let $r = \sqrt{a^2 + b^2} = |z|$. Recall that $r$ is the modulus of $z$.
Note first that

\[ \left( \frac{a}{r} \right)^2 + \left( \frac{b}{r} \right)^2 = \frac{a^2 + b^2}{r^2} = 1 \]

and so $\left( \frac{a}{r}, \frac{b}{r} \right)$ is a point on the unit circle. Therefore, there exists an angle $\theta$ (in radians) such that

\[ \cos\theta = \frac{a}{r}, \quad \sin\theta = \frac{b}{r} \]

In other words $\theta$ is an angle such that $a = r\cos\theta$ and $b = r\sin\theta$, that is $\theta = \cos^{-1}(a/r)$ and $\theta =
\sin^{-1}(b/r)$. We call this angle $\theta$ the argument of $z$.

We often speak of the principal argument of $z$. This is the unique angle $\theta \in (-\pi, \pi]$ such that

\[ \cos\theta = \frac{a}{\sqrt{a^2 + b^2}}, \quad \sin\theta = \frac{b}{\sqrt{a^2 + b^2}} \]

The polar form of the complex number $z = a + bi = r(\cos\theta + i\sin\theta)$ is for convenience written as:

\[ z = re^{i\theta} \]

where $\theta$ is the argument of $z$.


When given $z = re^{i\theta}$, the identity $e^{i\theta} = \cos\theta + i\sin\theta$ will convert $z$ back to standard form. Here we
think of $e^{i\theta}$ as a short cut for $\cos\theta + i\sin\theta$. This is all we will need in this course, but in reality $e^{i\theta}$ can be
considered as the complex equivalent of the exponential function where this turns out to be a true equality.


\[ a + bi = re^{i\theta} \]


Thus we can convert any complex number in the standard (Cartesian) form $z = a + bi$ into its polar
form. Consider the following example.


Solution. First, find $r$. By the above discussion, $r = \sqrt{a^2 + b^2} = |z|$. Therefore, with $a = b = 2$,

\[ r = \sqrt{2^2 + 2^2} = \sqrt{8} = 2\sqrt{2} \]

Now, to find $\theta$, we plot the point $(2, 2)$ and find the angle from the positive $x$ axis to the line between
this point and the origin. In this case, $\theta = 45^\circ = \frac{\pi}{4}$. That is, we found the unique angle $\theta$ such that
$\theta = \cos^{-1}(1/\sqrt{2})$ and $\theta = \sin^{-1}(1/\sqrt{2})$.

Note  that  in  polar  form,  we  always  express  angles  in  radians,  not  degrees. 

Hence, we can write $z$ as

\[ z = 2\sqrt{2}\, e^{i\frac{\pi}{4}} \]

♠

Notice  that  the  standard  and  polar  forms  are  completely  equivalent.  That  is  not  only  can  we  transform 
a  complex  number  from  standard  form  to  its  polar  form,  we  can  also  take  a  complex  number  in  polar  form 
and  convert  it  back  to  standard  form. 


Solution. Let $z = 2e^{2\pi i/3}$ be the polar form of a complex number. Recall that $e^{i\theta} = \cos\theta + i\sin\theta$. Therefore,
using standard values of sin and cos we get:

\[ z = 2e^{i2\pi/3} = 2(\cos(2\pi/3) + i\sin(2\pi/3)) = 2\left(-\frac{1}{2} + i\frac{\sqrt{3}}{2}\right) = -1 + \sqrt{3}\,i \]

which is the standard form of this complex number. ♠

You can always verify your answer by converting it back to polar form and ensuring you reach the
original answer.
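The two conversions in this section can be checked numerically. The following is a minimal Python sketch using only the standard library; the helper names `to_polar` and `to_standard` are illustrative choices, not notation from the text. It relies on `cmath.phase`, which returns the principal argument for ordinary inputs.

```python
import cmath
import math

def to_polar(z):
    """Return (r, theta) with z = r * e^{i*theta}; theta is the principal argument."""
    return abs(z), cmath.phase(z)

def to_standard(r, theta):
    """Return r * (cos(theta) + i*sin(theta)) as a Python complex number."""
    return complex(r * math.cos(theta), r * math.sin(theta))

# z = 2 + 2i has modulus 2*sqrt(2) and argument pi/4.
r, theta = to_polar(2 + 2j)
print(r, theta)

# z = 2 e^{i 2pi/3} should be close to -1 + sqrt(3) i.
print(to_standard(2, 2 * math.pi / 3))
```

Because the computation uses floating point, the printed values agree with the exact answers only up to rounding error.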


Exercises 


Exercise 6.2.1 Let $z = 3 + 3i$ be a complex number written in standard form. Convert $z$ to polar form, and
write it in the form $z = re^{i\theta}$.

Exercise 6.2.2 Let $z = 2i$ be a complex number written in standard form. Convert $z$ to polar form, and
write it in the form $z = re^{i\theta}$.

Exercise 6.2.3 Let $z = 4e^{\frac{2\pi}{3} i}$ be a complex number written in polar form. Convert $z$ to standard form, and
write it in the form $z = a + bi$.

Exercise 6.2.4 Let $z = -1e^{\frac{\pi}{6} i}$ be a complex number written in polar form. Convert $z$ to standard form,
and write it in the form $z = a + bi$.

Exercise 6.2.5 If $z$ and $w$ are two complex numbers and the polar form of $z$ involves the angle $\theta$ while the
polar form of $w$ involves the angle $\phi$, show that in the polar form for $zw$ the angle involved is $\theta + \phi$.

6.3 Roots of Complex Numbers


A fundamental identity is the formula of De Moivre with which we begin this section.

Theorem 6.15: De Moivre's Theorem

For any positive integer $n$,

\[ (r(\cos\theta + i\sin\theta))^n = r^n(\cos n\theta + i\sin n\theta) \]

Proof. The proof is by induction on $n$. It is clear the formula holds if $n = 1$. Suppose it is true for $n$. Then,
consider $n + 1$.

\[ (r(\cos\theta + i\sin\theta))^{n+1} = (r(\cos\theta + i\sin\theta))^n\,(r(\cos\theta + i\sin\theta)) \]

which by induction equals

\[ = r^{n+1}(\cos n\theta + i\sin n\theta)(\cos\theta + i\sin\theta) \]

\[ = r^{n+1}\left((\cos n\theta\cos\theta - \sin n\theta\sin\theta) + i(\sin n\theta\cos\theta + \cos n\theta\sin\theta)\right) \]

\[ = r^{n+1}(\cos(n+1)\theta + i\sin(n+1)\theta) \]

by the formulas for the cosine and sine of the sum of two angles. ♠

The process used in the previous proof, called mathematical induction, is very powerful in Mathematics
and Computer Science and is explored in more detail in the Appendix.
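De Moivre's formula is easy to spot-check numerically. The sketch below (plain Python; the function names are illustrative assumptions) compares repeated multiplication of $r(\cos\theta + i\sin\theta)$ with the closed form $r^n(\cos n\theta + i\sin n\theta)$ for a few positive integers $n$. A numerical check is not a proof, but it is a quick way to catch a sign or index error.

```python
import math

def direct_power(r, theta, n):
    """Compute (r(cos(theta) + i sin(theta)))^n by repeated multiplication."""
    w = complex(r * math.cos(theta), r * math.sin(theta))
    result = complex(1, 0)
    for _ in range(n):
        result *= w
    return result

def de_moivre(r, theta, n):
    """Compute r^n (cos(n*theta) + i sin(n*theta)), the right side of the formula."""
    return complex(r ** n * math.cos(n * theta), r ** n * math.sin(n * theta))

# Sample values chosen arbitrarily for illustration.
for n in range(1, 8):
    gap = abs(direct_power(1.5, 0.7, n) - de_moivre(1.5, 0.7, n))
    print(n, gap)  # the gap should be at the level of rounding error
```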

Now,  consider  a  corollary  of  Theorem  6.15. 


Corollary  6.16:  Roots  of  Complex  Numbers 


Let $z$ be a nonzero complex number. Then there are always exactly $k$ many $k$th roots of $z$ in $\mathbb{C}$.

Proof. Let $z = a + bi$ and let $z = |z|(\cos\theta + i\sin\theta)$ be the polar form of the complex number. By De
Moivre's theorem, a complex number

\[ w = re^{i\alpha} = r(\cos\alpha + i\sin\alpha) \]

is a $k$th root of $z$ if and only if

\[ w^k = (re^{i\alpha})^k = r^k e^{ik\alpha} = r^k(\cos k\alpha + i\sin k\alpha) = |z|(\cos\theta + i\sin\theta) \]

This requires $r^k = |z|$ and so $r = |z|^{1/k}$. Also, both $\cos(k\alpha) = \cos\theta$ and $\sin(k\alpha) = \sin\theta$. This can only
happen if

\[ k\alpha = \theta + 2\ell\pi \]

for $\ell$ an integer. Thus

\[ \alpha = \frac{\theta + 2\ell\pi}{k}, \quad \ell = 0, 1, 2, \cdots, k-1 \]

and so the $k$th roots of $z$ are of the form

\[ |z|^{1/k}\left(\cos\left(\frac{\theta + 2\ell\pi}{k}\right) + i\sin\left(\frac{\theta + 2\ell\pi}{k}\right)\right), \quad \ell = 0, 1, 2, \cdots, k-1 \]

Since the cosine and sine are periodic of period $2\pi$, there are exactly $k$ distinct numbers which result from
this formula. ♠

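The procedure for finding the $k$ $k$th roots of $z \in \mathbb{C}$ is as follows.

1. Write $z$ in polar form, $z = |z|e^{i\theta}$, and write the unknown root as $w = re^{i\alpha}$.

2. The equation $w^k = z$ becomes $r^k e^{ik\alpha} = |z|e^{i\theta}$. Solve $r^k = |z|$ to get $r = |z|^{1/k}$.

3. Solve $k\alpha = \theta + 2\pi\ell$ for $\ell = 0, 1, 2, \cdots, k-1$.

4. The $k$ roots are the numbers $w = re^{i\alpha}$, one for each value of $\alpha$ found in step 3.

Notice that once the roots are obtained in the final step, they can then be converted to standard form
if necessary. Let's consider an example of this concept. Note that according to Corollary 6.16, there are
exactly 3 cube roots of a nonzero complex number.

Solution. We seek all complex numbers $z$ with $z^3 = i$. First, convert each number to polar form: $z = re^{i\theta}$ and $i = 1e^{i\pi/2}$. The equation now becomes

\[ (re^{i\theta})^3 = r^3 e^{3i\theta} = 1e^{i\pi/2} \]

Therefore, the two equations that we need to solve are $r^3 = 1$ and $3i\theta = i\pi/2$. Given that $r \in \mathbb{R}$ and $r^3 = 1$
it follows that $r = 1$.

Solving the second equation is as follows. First divide by $i$. Then, since the argument of $i$ is not unique
we write $3\theta = \pi/2 + 2\pi\ell$ for $\ell = 0, 1, 2$.

\[ 3\theta = \pi/2 + 2\pi\ell \quad \text{for } \ell = 0, 1, 2 \]

\[ \theta = \pi/6 + \frac{2}{3}\pi\ell \quad \text{for } \ell = 0, 1, 2 \]

For $\ell = 0$:

\[ \theta = \pi/6 + \frac{2}{3}\pi(0) = \pi/6 \]

For $\ell = 1$:

\[ \theta = \pi/6 + \frac{2}{3}\pi(1) = \frac{5}{6}\pi \]

For $\ell = 2$:

\[ \theta = \pi/6 + \frac{2}{3}\pi(2) = \frac{3}{2}\pi \]

Therefore, the three roots are given by

\[ 1e^{i\pi/6}, \quad 1e^{i\frac{5}{6}\pi}, \quad 1e^{i\frac{3}{2}\pi} \]

Written in standard form, these roots are, respectively,

\[ \frac{\sqrt{3}}{2} + i\frac{1}{2}, \quad -\frac{\sqrt{3}}{2} + i\frac{1}{2}, \quad -i \]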
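The root-finding steps used above can be sketched in a few lines of Python. This is an illustrative sketch, not part of the text: the function name `kth_roots` is an assumption, and the angles follow the formula $(\theta + 2\pi\ell)/k$ for $\ell = 0, 1, \cdots, k-1$ from the proof of Corollary 6.16.

```python
import cmath
import math

def kth_roots(z, k):
    """Return the k distinct kth roots of a nonzero complex number z."""
    if z == 0 or k < 1:
        raise ValueError("z must be nonzero and k must be a positive integer")
    r = abs(z) ** (1.0 / k)      # modulus of every root: |z|^(1/k)
    theta = cmath.phase(z)       # an argument of z
    # One root for each l = 0, 1, ..., k-1, with angle (theta + 2*pi*l)/k.
    return [cmath.rect(r, (theta + 2 * math.pi * l) / k) for l in range(k)]

# The three cube roots of i, in the same order as in the solution above.
for w in kth_roots(1j, 3):
    print(w, w ** 3)   # each w**3 should be close to i
```

The printed roots are floating-point approximations of $\frac{\sqrt{3}}{2} + \frac{1}{2}i$, $-\frac{\sqrt{3}}{2} + \frac{1}{2}i$ and $-i$.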


The  ability  to  find  kth  roots  can  also  be  used  to  factor  some  polynomials. 


Example  6.19:  Solving  a  Polynomial  Equation 


Factor  the  polynomial  x3  —  27. 


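Solution. First find the cube roots of 27. By the above procedure, these cube roots are $3$, $3\left(-\frac{1}{2} + i\frac{\sqrt{3}}{2}\right)$,
and $3\left(-\frac{1}{2} - i\frac{\sqrt{3}}{2}\right)$. You may wish to verify this using the above steps.

Therefore,

\[ x^3 - 27 = (x - 3)\left(x - 3\left(-\frac{1}{2} + i\frac{\sqrt{3}}{2}\right)\right)\left(x - 3\left(-\frac{1}{2} - i\frac{\sqrt{3}}{2}\right)\right) \]

Note also

\[ \left(x - 3\left(-\frac{1}{2} + i\frac{\sqrt{3}}{2}\right)\right)\left(x - 3\left(-\frac{1}{2} - i\frac{\sqrt{3}}{2}\right)\right) = x^2 + 3x + 9 \]

and so

\[ x^3 - 27 = (x - 3)(x^2 + 3x + 9) \]

where the quadratic polynomial $x^2 + 3x + 9$ cannot be factored without using complex numbers.

Note that even though the polynomial $x^3 - 27$ has all real coefficients, it has some complex zeros,
$3\left(-\frac{1}{2} + i\frac{\sqrt{3}}{2}\right)$ and $3\left(-\frac{1}{2} - i\frac{\sqrt{3}}{2}\right)$. These zeros are complex conjugates of each other. It is always
the case that if a polynomial has real coefficients and a complex root, it will also have a root equal to the
complex conjugate.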
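The factorization in Example 6.19 can be double-checked by multiplying the linear factors back together. The sketch below represents a polynomial as a list of coefficients, highest degree first; the helper `poly_mul` is a hypothetical name introduced only for this illustration.

```python
import math

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# One complex cube root of 27; the other complex root is its conjugate.
w = 3 * complex(-0.5, math.sqrt(3) / 2)

quadratic = poly_mul([1, -w], [1, -w.conjugate()])   # (x - w)(x - conj(w))
cubic = poly_mul([1, -3], quadratic)                 # (x - 3) times the quadratic

print(quadratic)   # approximately the coefficients of x^2 + 3x + 9
print(cubic)       # approximately the coefficients of x^3 - 27
```

Multiplying a conjugate pair of linear factors gives real coefficients (up to rounding), which is the reason the quadratic factor $x^2 + 3x + 9$ is real.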


Exercises 


Exercise 6.3.1 Give the complete solution to $x^4 + 16 = 0$.

Exercise 6.3.2 Find the complex cube roots of 8.

Exercise 6.3.3 Find the four fourth roots of 16.

Exercise 6.3.4 De Moivre's theorem says $[r(\cos t + i\sin t)]^n = r^n(\cos nt + i\sin nt)$ for $n$ a positive integer.
Does this formula continue to hold for all integers $n$, even negative integers? Explain.

Exercise 6.3.5 Factor $x^3 + 8$ as a product of linear factors. Hint: Use the result of 6.3.2.

Exercise 6.3.6 Write $x^3 + 27$ in the form $(x + 3)(x^2 + ax + b)$ where $x^2 + ax + b$ cannot be factored any
more using only real numbers.

Exercise 6.3.7 Completely factor $x^4 + 16$ as a product of linear factors. Hint: Use the result of 6.3.3.

Exercise 6.3.8 Factor $x^4 + 16$ as the product of two quadratic polynomials each of which cannot be
factored further without using complex numbers.

Exercise 6.3.9 If $n$ is an integer, is it always true that $(\cos\theta - i\sin\theta)^n = \cos(n\theta) - i\sin(n\theta)$? Explain.

Exercise 6.3.10 Suppose $p(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$ is a polynomial and it has $n$ zeros,

\[ z_1, z_2, \cdots, z_n \]

listed according to multiplicity. ($z$ is a root of multiplicity $m$ if the polynomial $f(x) = (x - z)^m$ divides $p(x)$
but $(x - z)f(x)$ does not.) Show that

\[ p(x) = a_n(x - z_1)(x - z_2)\cdots(x - z_n) \]


6.4  The  Quadratic  Formula 


Outcomes 


A.  Use  the  Quadratic  Formula  to  find  the  complex  roots  of  a  quadratic  equation. 


The roots (or solutions) of a quadratic equation $ax^2 + bx + c = 0$ where $a, b, c$ are real numbers are
obtained by solving the familiar quadratic formula given by

\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \]

When working with real numbers, we cannot solve this formula if $b^2 - 4ac < 0$. However, complex
numbers allow us to find square roots of negative numbers, and the quadratic formula remains valid for
finding roots of the corresponding quadratic equation. In this case there are exactly two distinct (complex)
square roots of $b^2 - 4ac$, which are $i\sqrt{4ac - b^2}$ and $-i\sqrt{4ac - b^2}$.

Here  is  an  example. 


Solution. In terms of the quadratic equation above, $a = 1$, $b = 2$, and $c = 5$. Therefore, we can use the
quadratic formula with these values, which becomes

\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-2 \pm \sqrt{(2)^2 - 4(1)(5)}}{2(1)} \]

Solving this equation, we see that the solutions are given by

\[ x = \frac{-2 \pm \sqrt{4 - 20}}{2} = \frac{-2 \pm 4i}{2} = -1 \pm 2i \]

We can verify that these are solutions of the original equation. We will show $x = -1 + 2i$ and leave
$x = -1 - 2i$ as an exercise.

\[ x^2 + 2x + 5 = (-1 + 2i)^2 + 2(-1 + 2i) + 5 = 1 - 4i - 4 - 2 + 4i + 5 = 0 \]

Hence $x = -1 + 2i$ is a solution. ♠

What  if  the  coefficients  of  the  quadratic  equation  are  actually  complex  numbers?  Does  the  formula 
hold  even  in  this  case?  The  answer  is  yes.  This  is  a  hint  on  how  to  do  Problem  6.4.4  below,  a  special  case 
of  the  fundamental  theorem  of  algebra,  and  an  ingredient  in  the  proof  of  some  versions  of  this  theorem. 

Consider  the  following  example. 


Solution. In terms of the quadratic equation above, $a = 1$, $b = -2i$, and $c = -5$. Therefore, we can use
the quadratic formula with these values, which becomes

\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{2i \pm \sqrt{(-2i)^2 - 4(1)(-5)}}{2(1)} \]

Solving this equation, we see that the solutions are given by

\[ x = \frac{2i \pm \sqrt{-4 + 20}}{2} = \frac{2i \pm 4}{2} = i \pm 2 \]

We can verify that these are solutions of the original equation. We will show $x = i + 2$ and leave
$x = i - 2$ as an exercise.

\[ x^2 - 2ix - 5 = (i + 2)^2 - 2i(i + 2) - 5 = -1 + 4i + 4 + 2 - 4i - 5 = 0 \]

Hence $x = i + 2$ is a solution. ♠
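Both worked examples in this section can be reproduced with a short Python sketch. The function name `quadratic_roots` is an assumption made for illustration; it applies the quadratic formula with `cmath.sqrt`, which returns one complex square root of the discriminant, so the same code covers real and complex coefficients.

```python
import cmath

def quadratic_roots(a, b, c):
    """Return both roots of a*x^2 + b*x + c = 0, where a != 0 and a, b, c may be complex."""
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic equation")
    d = cmath.sqrt(b * b - 4 * a * c)   # one square root of the discriminant
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

# x^2 + 2x + 5 = 0 (real coefficients, negative discriminant)
print(quadratic_roots(1, 2, 5))

# x^2 - 2ix - 5 = 0 (complex coefficients)
print(quadratic_roots(1, -2j, -5))

# Substituting a root back into the polynomial should give a value close to 0.
x = quadratic_roots(1, -2j, -5)[0]
print(x * x - 2j * x - 5)
```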

We  conclude  this  section  by  stating  an  essential  theorem. 


Theorem  6.22:  The  Fundamental  Theorem  of  Algebra 


Any polynomial of degree at least 1 with complex coefficients has a root which is a complex number.

Exercises

Exercise 6.4.1 Show that $1 + i$, $2 + i$ are the only two roots to

\[ p(x) = x^2 - (3 + 2i)x + (1 + 3i) \]

Hence complex zeros do not necessarily come in conjugate pairs if the coefficients of the equation are not
real.


Exercise  6.4.2  Give  the  solutions  to  the  following  quadratic  equations  having  real  coefficients. 


(a) $x^2 - 2x + 2 = 0$

(b) $3x^2 + x + 3 = 0$

(c) $x^2 - 6x + 13 = 0$

(d) $x^2 + 4x + 9 = 0$

(e) $4x^2 + 4x + 5 = 0$

Exercise  6.4.3  Give  the  solutions  to  the  following  quadratic  equations  having  complex  coefficients. 

(a) $x^2 + 2x + 1 + i = 0$

(b) $4x^2 + 4ix - 5 = 0$

(c) $4x^2 + (4 + 4i)x + 1 + 2i = 0$

(d) $x^2 - 4ix - 5 = 0$

(e) $3x^2 + (1 - i)x + 3i = 0$


Exercise 6.4.4 Prove the fundamental theorem of algebra for quadratic polynomials having coefficients
in $\mathbb{C}$. That is, show that an equation of the form
$ax^2 + bx + c = 0$ where $a, b, c$ are complex numbers, $a \neq 0$, has a complex solution. Hint: Consider the
fact, noted earlier, that the expressions given from the quadratic formula do in fact serve as solutions.


7.  Spectral  Theory 


7.1  Eigenvalues  and  Eigenvectors  of  a  Matrix 


Outcomes 


A.  Describe  eigenvalues  geometrically  and  algebraically. 

B.  Find  eigenvalues  and  eigenvectors  for  a  square  matrix. 

Spectral  Theory  refers  to  the  study  of  eigenvalues  and  eigenvectors  of  a  matrix.  It  is  of  fundamental 
importance  in  many  areas  and  is  the  subject  of  our  study  for  this  chapter. 

7.1.1.  Definition  of  Eigenvectors  and  Eigenvalues 


In  this  section,  we  will  work  with  the  entire  set  of  complex  numbers,  denoted  by  C.  Recall  that  the  real 
numbers,  R  are  contained  in  the  complex  numbers,  so  the  discussions  in  this  section  apply  to  both  real  and 
complex  numbers. 

To  illustrate  the  idea  behind  what  will  be  discussed,  consider  the  following  example. 


Solution. First, compute $AX$ for

\[ X = \begin{bmatrix} -5 \\ -4 \\ 3 \end{bmatrix} \]

This product is given by

\[ AX = \begin{bmatrix} 0 & 5 & -10 \\ 0 & 22 & 16 \\ 0 & -9 & -2 \end{bmatrix} \begin{bmatrix} -5 \\ -4 \\ 3 \end{bmatrix} = \begin{bmatrix} -50 \\ -40 \\ 30 \end{bmatrix} = 10 \begin{bmatrix} -5 \\ -4 \\ 3 \end{bmatrix} \]

In this case, the product $AX$ resulted in a vector which is equal to 10 times the vector $X$. In other
words, $AX = 10X$.

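Let's see what happens in the next product. Compute $AX$ for the vector

\[ X = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \]

This product is given by

\[ AX = \begin{bmatrix} 0 & 5 & -10 \\ 0 & 22 & 16 \\ 0 & -9 & -2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = 0 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \]

In this case, the product $AX$ resulted in a vector equal to 0 times the vector $X$, $AX = 0X$.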
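Perhaps this matrix is such that $AX$ results in $kX$, for every vector $X$. However, consider

\[ \begin{bmatrix} 0 & 5 & -10 \\ 0 & 22 & 16 \\ 0 & -9 & -2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -5 \\ 38 \\ -11 \end{bmatrix} \]

In this case, $AX$ did not result in a vector of the form $kX$ for some scalar $k$. ♠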
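The three products in this example can be reproduced with a short Python sketch. The helper names `mat_vec` and `scalar_multiple` are illustrative assumptions; the second one reports the scalar $k$ when $AX = kX$ and `None` when no such scalar exists. Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

def mat_vec(A, X):
    """Multiply a matrix (list of rows) by a column vector (list of entries)."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

def scalar_multiple(A, X):
    """Return k if A X = k X for a single scalar k, otherwise None. X must be nonzero."""
    AX = mat_vec(A, X)
    i = next(j for j, x in enumerate(X) if x != 0)   # first nonzero entry of X
    k = Fraction(AX[i], X[i])
    return k if all(ax == k * x for ax, x in zip(AX, X)) else None

A = [[0, 5, -10],
     [0, 22, 16],
     [0, -9, -2]]

print(scalar_multiple(A, [-5, -4, 3]))   # AX = 10X
print(scalar_multiple(A, [1, 0, 0]))     # AX = 0X
print(scalar_multiple(A, [1, 1, 1]))     # no scalar works
```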

There is something special about the first two products calculated in Example 7.1. Notice that for
each, $AX = kX$ where $k$ is some scalar. When this equation holds for some $X$ and $k$, we call the scalar
$k$ an eigenvalue of $A$. We often use the special symbol $\lambda$ instead of $k$ when referring to eigenvalues. In
Example 7.1, the values 10 and 0 are eigenvalues for the matrix $A$ and we can label these as $\lambda_1 = 10$ and
$\lambda_2 = 0$.

When $AX = \lambda X$ for some $X \neq 0$, we call such an $X$ an eigenvector of the matrix $A$. The eigenvectors
of $A$ are associated to an eigenvalue. Hence, if $\lambda_1$ is an eigenvalue of $A$ and $AX = \lambda_1 X$, we can label this
eigenvector as $X_1$. Note again that in order to be an eigenvector, $X$ must be nonzero.

There is also a geometric significance to eigenvectors. When you have a nonzero vector which, when
multiplied by a matrix, results in another vector which is parallel to the first or equal to 0, this vector is
called an eigenvector of the matrix. This is the meaning when the vectors are in $\mathbb{R}^n$.
The formal definition of eigenvalues and eigenvectors is as follows.

Definition 7.2: Eigenvalues and Eigenvectors

Let $A$ be an $n \times n$ matrix and let $X \in \mathbb{C}^n$ be a nonzero vector for which

\[ AX = \lambda X \tag{7.1} \]

for some scalar $\lambda$. Then $\lambda$ is called an eigenvalue of the matrix $A$ and $X$ is called an eigenvector of
$A$ associated with $\lambda$, or a $\lambda$-eigenvector of $A$.

The set of all eigenvalues of an $n \times n$ matrix $A$ is denoted by $\sigma(A)$ and is referred to as the spectrum
of $A$.


The  eigenvectors  of  a  matrix  A  are  those  vectors  X  for  which  multiplication  by  A  results  in  a  vector  in 
the  same  direction  or  opposite  direction  to  X.  Since  the  zero  vector  0  has  no  direction  this  would  make  no 
sense  for  the  zero  vector.  As  noted  above,  0  is  never  allowed  to  be  an  eigenvector. 

Let's look at eigenvectors in more detail. Suppose $X$ satisfies 7.1. Then

\[ AX - \lambda X = 0 \]

or

\[ (A - \lambda I)X = 0 \]

for some $X \neq 0$. Equivalently you could write $(\lambda I - A)X = 0$, which is more commonly used. Hence,
when we are looking for eigenvectors, we are looking for nontrivial solutions to this homogeneous system
of equations!

Recall that the solutions to a homogeneous system of equations consist of basic solutions, and the
linear combinations of those basic solutions. In this context, we call the basic solutions of the equation
$(\lambda I - A)X = 0$ basic eigenvectors. It follows that any (nonzero) linear combination of basic eigenvectors
is again an eigenvector.

Suppose the matrix $(\lambda I - A)$ is invertible, so that $(\lambda I - A)^{-1}$ exists. Then the following equation
would be true.

\[ X = IX = \left((\lambda I - A)^{-1}(\lambda I - A)\right)X = (\lambda I - A)^{-1}\left((\lambda I - A)X\right) = (\lambda I - A)^{-1}0 = 0 \]

This claims that $X = 0$. However, we have required that $X \neq 0$. Therefore $(\lambda I - A)$ cannot have an inverse!

Recall that if a matrix is not invertible, then its determinant is equal to 0. Therefore we can conclude
that

\[ \det(\lambda I - A) = 0 \tag{7.2} \]

Note that this is equivalent to $\det(A - \lambda I) = 0$.

The expression $\det(xI - A)$ is a polynomial (in the variable $x$) called the characteristic polynomial
of $A$, and $\det(xI - A) = 0$ is called the characteristic equation. For this reason we may also refer to the
eigenvalues of $A$ as characteristic values, but the former is often used for historical reasons.

The following theorem claims that the roots of the characteristic polynomial are the eigenvalues of $A$.
Thus when 7.2 holds, $A$ has a nonzero eigenvector.

Theorem 7.3: The Existence of an Eigenvector

Let $A$ be an $n \times n$ matrix and suppose $\det(\lambda I - A) = 0$ for some $\lambda \in \mathbb{C}$.

Then $\lambda$ is an eigenvalue of $A$ and thus there exists a nonzero vector $X \in \mathbb{C}^n$ such that $AX = \lambda X$.

Proof. For $A$ an $n \times n$ matrix, the method of Laplace Expansion demonstrates that $\det(\lambda I - A)$ is a polynomial
of degree $n$. As such, the equation 7.2 has a solution $\lambda \in \mathbb{C}$ by the Fundamental Theorem of Algebra.
The fact that $\lambda$ is an eigenvalue is left as an exercise. ♠

7.1.2.  Finding  Eigenvectors  and  Eigenvalues 


Now  that  eigenvalues  and  eigenvectors  have  been  defined,  we  will  study  how  to  find  them  for  a  matrix  A. 
First,  consider  the  following  definition. 


Definition  7.4:  Multiplicity  of  an  Eigenvalue 


Let $A$ be an $n \times n$ matrix with characteristic polynomial given by $\det(xI - A)$. Then, the multiplicity
of an eigenvalue $\lambda$ of $A$ is the number of times $\lambda$ occurs as a root of that characteristic polynomial.

For example, suppose the characteristic polynomial of $A$ is given by $(x - 2)^2$. Solving for the roots of
this polynomial, we set $(x - 2)^2 = 0$ and solve for $x$. We find that $\lambda = 2$ is a root that occurs twice. Hence,
in this case, $\lambda = 2$ is an eigenvalue of $A$ of multiplicity equal to 2.

We will now look at how to find the eigenvalues and eigenvectors for a matrix $A$ in detail. The steps
used are summarized in the following procedure.

Procedure 7.5: Finding Eigenvalues and Eigenvectors

Let $A$ be an $n \times n$ matrix.

1. First, find the eigenvalues $\lambda$ of $A$ by solving the equation $\det(xI - A) = 0$.

2. For each $\lambda$, find the basic eigenvectors $X \neq 0$ by finding the basic solutions to $(\lambda I - A)X = 0$.

To verify your work, make sure that $AX = \lambda X$ for each $\lambda$ and associated eigenvector $X$.


We  will  explore  these  steps  further  in  the  following  example. 


Example  7.6:  Find  the  Eigenvalues  and  Eigenvectors 

Let $A = \begin{bmatrix} -5 & 2 \\ -7 & 4 \end{bmatrix}$. Find its eigenvalues and eigenvectors.

Solution. We will use Procedure 7.5. First we find the eigenvalues of $A$ by solving the equation

\[ \det(xI - A) = 0 \]

This gives

\[ \det\left(x\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} -5 & 2 \\ -7 & 4 \end{bmatrix}\right) = 0 \]

\[ \det\begin{bmatrix} x + 5 & -2 \\ 7 & x - 4 \end{bmatrix} = 0 \]

Computing the determinant as usual, the result is

\[ x^2 + x - 6 = 0 \]

Solving this equation, we find that $\lambda_1 = 2$ and $\lambda_2 = -3$.

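Now we need to find the basic eigenvectors for each $\lambda$. First we will find the eigenvectors for $\lambda_1 = 2$.
We wish to find all vectors $X \neq 0$ such that $AX = 2X$. These are the solutions to $(2I - A)X = 0$.

\[ \left(2\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} -5 & 2 \\ -7 & 4 \end{bmatrix}\right)\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

\[ \begin{bmatrix} 7 & -2 \\ 7 & -2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

The augmented matrix for this system and corresponding reduced row-echelon form are given by

\[ \left[\begin{array}{rr|r} 7 & -2 & 0 \\ 7 & -2 & 0 \end{array}\right] \rightarrow \cdots \rightarrow \left[\begin{array}{rr|r} 1 & -\frac{2}{7} & 0 \\ 0 & 0 & 0 \end{array}\right] \]

The solution is any vector of the form

\[ \begin{bmatrix} \frac{2}{7}s \\ s \end{bmatrix} = s\begin{bmatrix} \frac{2}{7} \\ 1 \end{bmatrix} \]

Multiplying this vector by 7 we obtain a simpler description for the solution to this system, given by

\[ t\begin{bmatrix} 2 \\ 7 \end{bmatrix} \]

This gives the basic eigenvector for $\lambda_1 = 2$ as

\[ \begin{bmatrix} 2 \\ 7 \end{bmatrix} \]

To check, we verify that $AX = 2X$ for this basic eigenvector.

\[ \begin{bmatrix} -5 & 2 \\ -7 & 4 \end{bmatrix}\begin{bmatrix} 2 \\ 7 \end{bmatrix} = \begin{bmatrix} 4 \\ 14 \end{bmatrix} = 2\begin{bmatrix} 2 \\ 7 \end{bmatrix} \]

This is what we wanted, so we know this basic eigenvector is correct.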

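Next we will repeat this process to find the basic eigenvector for $\lambda_2 = -3$. We wish to find all vectors
$X \neq 0$ such that $AX = -3X$. These are the solutions to $((-3)I - A)X = 0$.

\[ \left((-3)\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} -5 & 2 \\ -7 & 4 \end{bmatrix}\right)\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

\[ \begin{bmatrix} 2 & -2 \\ 7 & -7 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

The augmented matrix for this system and corresponding reduced row-echelon form are given by

\[ \left[\begin{array}{rr|r} 2 & -2 & 0 \\ 7 & -7 & 0 \end{array}\right] \rightarrow \cdots \rightarrow \left[\begin{array}{rr|r} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right] \]

The solution is any vector of the form

\[ \begin{bmatrix} s \\ s \end{bmatrix} = s\begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

This gives the basic eigenvector for $\lambda_2 = -3$ as

\[ \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

To check, we verify that $AX = -3X$ for this basic eigenvector.

\[ \begin{bmatrix} -5 & 2 \\ -7 & 4 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -3 \\ -3 \end{bmatrix} = -3\begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

This is what we wanted, so we know this basic eigenvector is correct. ♠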
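For a $2 \times 2$ matrix, step 1 of Procedure 7.5 reduces to a quadratic equation, since $\det(xI - A) = x^2 - (a + d)x + (ad - bc)$ when $A$ has rows $(a, b)$ and $(c, d)$. The following Python sketch applies this to the matrix of Example 7.6 and then checks $AX = \lambda X$ for the basic eigenvectors found above. The function names are illustrative assumptions, and the approach is specific to the $2 \times 2$ case.

```python
import cmath

def eigenvalues_2x2(A):
    """Roots of det(xI - A) = x^2 - (a + d) x + (ad - bc) for a 2x2 matrix A."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2

def mat_vec(A, X):
    """Multiply a matrix (list of rows) by a column vector (list of entries)."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

A = [[-5, 2],
     [-7, 4]]

lam1, lam2 = eigenvalues_2x2(A)
print(lam1, lam2)              # the two eigenvalues, as complex numbers

# Check AX = lambda X for the basic eigenvectors from the example.
print(mat_vec(A, [2, 7]))      # compare with 2 * [2, 7]
print(mat_vec(A, [1, 1]))      # compare with -3 * [1, 1]
```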

The  following  is  an  example  using  Procedure  7.5  for  a  3  x  3  matrix. 


Example 7.7: Find the Eigenvalues and Eigenvectors

Find the eigenvalues and eigenvectors for the matrix

\[ A = \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix} \]

Solution. We will use Procedure 7.5. First we need to find the eigenvalues of $A$. Recall that they are the
solutions of the equation

\[ \det(xI - A) = 0 \]

In this case the equation is

\[ \det\left(x\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix}\right) = 0 \]

which becomes

\[ \det\begin{bmatrix} x - 5 & 10 & 5 \\ -2 & x - 14 & -2 \\ 4 & 8 & x - 6 \end{bmatrix} = 0 \]

Using Laplace Expansion, compute this determinant and simplify. The result is the following equation.

\[ (x - 5)(x^2 - 20x + 100) = 0 \]

Solving this equation, we find that the eigenvalues are $\lambda_1 = 5$, $\lambda_2 = 10$ and $\lambda_3 = 10$. Notice that 10 is
a root of multiplicity two due to

\[ x^2 - 20x + 100 = (x - 10)^2 \]

Therefore, $\lambda_2 = 10$ is an eigenvalue of multiplicity two.

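Now that we have found the eigenvalues for $A$, we can compute the eigenvectors.

First we will find the basic eigenvectors for $\lambda_1 = 5$. In other words, we want to find all non-zero vectors
$X$ so that $AX = 5X$. This requires that we solve the equation $(5I - A)X = 0$ for $X$ as follows.

\[ \left(5\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix}\right)\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

That is you need to find the solution to

\[ \begin{bmatrix} 0 & 10 & 5 \\ -2 & -9 & -2 \\ 4 & 8 & -1 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

By now this is a familiar problem. You set up the augmented matrix and row reduce to get the solution.
Thus the matrix you must row reduce is

\[ \left[\begin{array}{rrr|r} 0 & 10 & 5 & 0 \\ -2 & -9 & -2 & 0 \\ 4 & 8 & -1 & 0 \end{array}\right] \]

The reduced row-echelon form is

\[ \left[\begin{array}{rrr|r} 1 & 0 & -\frac{5}{4} & 0 \\ 0 & 1 & \frac{1}{2} & 0 \\ 0 & 0 & 0 & 0 \end{array}\right] \]

and so the solution is any vector of the form

\[ \begin{bmatrix} \frac{5}{4}s \\ -\frac{1}{2}s \\ s \end{bmatrix} = s\begin{bmatrix} \frac{5}{4} \\ -\frac{1}{2} \\ 1 \end{bmatrix} \]

where $s \in \mathbb{R}$. If we multiply this vector by 4, we obtain a simpler description for the solution to this system,
as given by

\[ t\begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix} \tag{7.3} \]

where $t \in \mathbb{R}$. Here, the basic eigenvector is given by

\[ X_1 = \begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix} \]

Notice that we cannot let $t = 0$ here, because this would result in the zero vector and eigenvectors are
never equal to 0! Other than this value, every other choice of $t$ in 7.3 results in an eigenvector.

It is a good idea to check your work! To do so, we will take the original matrix and multiply by the
basic eigenvector $X_1$. We check to see if we get $5X_1$.

\[ \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix}\begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix} = \begin{bmatrix} 25 \\ -10 \\ 20 \end{bmatrix} = 5\begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix} \]

This is what we wanted, so we know that our calculations were correct.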


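Next we will find the basic eigenvectors for $\lambda_2, \lambda_3 = 10$. These vectors are the basic solutions to the
equation,

\[ \left(10\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix}\right)\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

That is you must find the solutions to

\[ \begin{bmatrix} 5 & 10 & 5 \\ -2 & -4 & -2 \\ 4 & 8 & 4 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

Consider the augmented matrix

\[ \left[\begin{array}{rrr|r} 5 & 10 & 5 & 0 \\ -2 & -4 & -2 & 0 \\ 4 & 8 & 4 & 0 \end{array}\right] \]

The reduced row-echelon form for this matrix is

\[ \left[\begin{array}{rrr|r} 1 & 2 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right] \]

and so the eigenvectors are of the form

\[ \begin{bmatrix} -2s - t \\ s \\ t \end{bmatrix} = s\begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} + t\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \]

Note that you can't pick $t$ and $s$ both equal to zero because this would result in the zero vector and
eigenvectors are never equal to zero.

Here, there are two basic eigenvectors, given by

\[ X_2 = \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}, \quad X_3 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \]

Taking any (nonzero) linear combination of $X_2$ and $X_3$ will also result in an eigenvector for the
eigenvalue $\lambda = 10$. As in the case for $\lambda = 5$, always check your work! For the basic eigenvector $X_3$, we can
check $AX_3 = 10X_3$ as follows.

\[ \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix}\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -10 \\ 0 \\ 10 \end{bmatrix} = 10\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \]

This is what we wanted. Checking the other basic eigenvector, $X_2$, is left as an exercise. ♠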
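The checks recommended in Example 7.7 can be automated. The Python sketch below (helper names are assumptions for illustration) tests whether a given pair $(\lambda, X)$ satisfies $AX = \lambda X$ with $X \neq 0$, and applies it to the basic eigenvectors of this example and to one nonzero linear combination of the two eigenvectors for $\lambda = 10$.

```python
def mat_vec(A, X):
    """Multiply a matrix (list of rows) by a column vector (list of entries)."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

def is_eigenpair(A, lam, X):
    """True when X is nonzero and A X equals lam * X exactly (integer arithmetic)."""
    nonzero = any(x != 0 for x in X)
    return nonzero and mat_vec(A, X) == [lam * x for x in X]

A = [[5, -10, -5],
     [2, 14, 2],
     [-4, -8, 6]]

X1 = [5, -2, 4]    # basic eigenvector for eigenvalue 5
X2 = [-2, 1, 0]    # basic eigenvectors for eigenvalue 10
X3 = [-1, 0, 1]

print(is_eigenpair(A, 5, X1))
print(is_eigenpair(A, 10, X2))
print(is_eigenpair(A, 10, X3))

# A nonzero linear combination 3*X2 - 2*X3 is again an eigenvector for 10.
combo = [3 * a - 2 * b for a, b in zip(X2, X3)]
print(combo, is_eigenpair(A, 10, combo))
```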

It is important to remember that for any eigenvector $X$, $X \neq 0$. However, it is possible to have
eigenvalues equal to zero. This is illustrated in the following example.


Solution. First we find the eigenvalues of $A$. We will do so using Definition 7.2.
In order to find the eigenvalues of $A$, we solve the following equation.

\[ \det(xI - A) = \det\begin{bmatrix} x - 2 & -2 & 2 \\ -1 & x - 3 & 1 \\ 1 & -1 & x - 1 \end{bmatrix} = 0 \]

This reduces to $x^3 - 6x^2 + 8x = 0$. You can verify that the solutions are $\lambda_1 = 0$, $\lambda_2 = 2$, $\lambda_3 = 4$. Notice
that while eigenvectors can never equal 0, it is possible to have an eigenvalue equal to 0.

Now we will find the basic eigenvectors. For $\lambda_1 = 0$, we need to solve the equation $(0I - A)X = 0$.
This equation becomes $-AX = 0$, and so the augmented matrix for finding the solutions is given by

\[ \left[\begin{array}{rrr|r} -2 & -2 & 2 & 0 \\ -1 & -3 & 1 & 0 \\ 1 & -1 & -1 & 0 \end{array}\right] \]

The reduced row-echelon form is

\[ \left[\begin{array}{rrr|r} 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right] \]

Therefore, the eigenvectors are of the form $t\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$ where $t \neq 0$ and the basic eigenvector is given by

\[ X_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \]

We can verify that this eigenvector is correct by checking that the equation $AX_1 = 0X_1$ holds. The
product $AX_1$ is given by

\[ AX_1 = \begin{bmatrix} 2 & 2 & -2 \\ 1 & 3 & -1 \\ -1 & 1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

This clearly equals $0X_1$, so the equation holds. Hence, $AX_1 = 0X_1$ and so 0 is an eigenvalue of $A$.
Computing the other basic eigenvectors is left as an exercise. ♠
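The claim that $0$, $2$ and $4$ are the roots of the characteristic polynomial in this example can be checked by evaluating $\det(xI - A)$ directly at those values. The following Python sketch does so for the $3 \times 3$ matrix $A$ shown in the verification above; `det3` and `char_poly_at` are hypothetical helper names introduced for illustration.

```python
def det3(M):
    """Determinant of a 3x3 matrix by Laplace expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_poly_at(A, x):
    """Evaluate det(xI - A) at the number x for a 3x3 matrix A."""
    M = [[(x if r == c else 0) - A[r][c] for c in range(3)] for r in range(3)]
    return det3(M)

def mat_vec(A, X):
    """Multiply a matrix (list of rows) by a column vector (list of entries)."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

A = [[2, 2, -2],
     [1, 3, -1],
     [-1, 1, 1]]

for x in (0, 2, 4):
    print(x, char_poly_at(A, x))   # each value should be 0

print(mat_vec(A, [1, 0, 1]))       # the zero vector, so 0 is an eigenvalue
```

Since $\det(0I - A) = -\det(A)$ for a $3 \times 3$ matrix, the eigenvalue $0$ appears exactly when $A$ is not invertible.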


In the following sections, we examine ways to simplify this process of finding eigenvalues and
eigenvectors by using properties of special types of matrices.


7.1.3.  Eigenvalues  and  Eigenvectors  for  Special  Types  of  Matrices 


There  are  three  special  kinds  of  matrices  which  we  can  use  to  simplify  the  process  of  finding  eigenvalues 
and  eigenvectors.  Throughout  this  section,  we  will  discuss  similar  matrices,  elementary  matrices,  as  well 
as  triangular  matrices. 

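We begin with a definition.

Definition 7.9: Similar Matrices

Let $A$ and $B$ be $n \times n$ matrices. Suppose there exists an invertible matrix $P$ such that $A = P^{-1}BP$.
Then $A$ and $B$ are called similar matrices.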


It  turns  out  that  we  can  use  the  concept  of  similar  matrices  to  help  us  find  the  eigenvalues  of  matrices. 
Consider  the  following  lemma. 


Lemma  7.10:  Similar  Matrices  and  Eigenvalues 


Let $A$ and $B$ be similar matrices, so that $A = P^{-1}BP$ where $A, B$ are $n \times n$ matrices and $P$ is invertible.
Then $A, B$ have the same eigenvalues.

Proof. We need to show that if $A = P^{-1}BP$, then every eigenvalue of $A$ is an eigenvalue of $B$, and every
eigenvalue of $B$ is an eigenvalue of $A$.

Suppose $A = P^{-1}BP$ and $\lambda$ is an eigenvalue of $A$, that is
$AX = \lambda X$ for some $X \neq 0$. Then

\[ P^{-1}BPX = \lambda X \]

and so

\[ BPX = \lambda PX \]

Since $P$ is one to one and $X \neq 0$, it follows that $PX \neq 0$. Here, $PX$ plays the role of the eigenvector in
this equation. Thus $\lambda$ is also an eigenvalue of $B$. Since $B = PAP^{-1}$, the same argument with the roles of $A$
and $B$ exchanged shows that any eigenvalue of $B$ is also an eigenvalue of $A$, and thus both matrices have the
same eigenvalues as desired. ♠

Note that this proof also demonstrates that the eigenvectors of $A$ and $B$ will (generally) be different.
We see in the proof that $AX = \lambda X$, while $B(PX) = \lambda(PX)$. Therefore, for an eigenvalue $\lambda$, $A$ will have
the eigenvector $X$ while $B$ will have the eigenvector $PX$.

The  second  special  type  of  matrices  we  discuss  in  this  section  is  elementary  matrices.  Recall  from 
Definition  2.43  that  an  elementary  matrix  E  is  obtained  by  applying  one  row  operation  to  the  identity 
matrix. 

It  is  possible  to  use  elementary  matrices  to  simplify  a  matrix  before  searching  for  its  eigenvalues  and 
eigenvectors.  This  is  illustrated  in  the  following  example. 


Example  7.11:  Simplify  Using  Elementary  Matrices 


Find the eigenvalues for the matrix

\[ A = \begin{bmatrix} 33 & 105 & 105 \\ 10 & 28 & 30 \\ -20 & -60 & -62 \end{bmatrix} \]

Solution.  This  matrix  has  big  numbers  and  therefore  we  would  like  to  simplify  as  much  as  possible  before 
computing  the  eigenvalues. 

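We will do so using row operations. First, add 2 times the second row to the third row. To do so, left
multiply A by E(2,2). Then right multiply by the inverse of E(2,2) as illustrated.

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix}
\begin{bmatrix} 33 & 105 & 105 \\ 10 & 28 & 30 \\ -20 & -60 & -62 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix}
=
\begin{bmatrix} 33 & -105 & 105 \\ 10 & -32 & 30 \\ 0 & 0 & -2 \end{bmatrix}
\]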

By  Lemma  7.10,  the  resulting  matrix  has  the  same  eigenvalues  as  A  where  here,  the  matrix  E  (2,2)  plays 
the  role  of  P. 

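We do this step again, as follows. In this step, we use the elementary matrix obtained by adding −3
times the second row to the first row.

\[
\begin{bmatrix} 1 & -3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 33 & -105 & 105 \\ 10 & -32 & 30 \\ 0 & 0 & -2 \end{bmatrix}
\begin{bmatrix} 1 & 3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
=
\begin{bmatrix} 3 & 0 & 15 \\ 10 & -2 & 30 \\ 0 & 0 & -2 \end{bmatrix}
\qquad (7.4)
\]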
Again by Lemma 7.10, this resulting matrix has the same eigenvalues as A. At this point, we can easily
find the eigenvalues. Let

\[ B = \begin{bmatrix} 3 & 0 & 15 \\ 10 & -2 & 30 \\ 0 & 0 & -2 \end{bmatrix} \]

Then, we find the eigenvalues of B (and therefore of A) by solving the equation $\det(xI - B) = 0$. You
should verify that this equation becomes

\[ (x + 2)(x + 2)(x - 3) = 0 \]

Solving this equation results in eigenvalues of $\lambda_1 = -2$, $\lambda_2 = -2$, and $\lambda_3 = 3$. Therefore, these are also
the eigenvalues of A. ♠

Through using elementary matrices, we were able to create a matrix for which finding the eigenvalues
was easier than for A. At this point, you could go back to the original matrix A and solve $(\lambda I - A)X = 0$
to obtain the eigenvectors of A.

Notice that when you multiply on the right by an elementary matrix, you are doing the column operation
defined by the elementary matrix. In 7.4 multiplication by the elementary matrix on the right
merely involves taking three times the first column and adding to the second. Thus, without referring to
the elementary matrices, the transition to the new matrix in 7.4 can be illustrated by

\[
\begin{bmatrix} 33 & -105 & 105 \\ 10 & -32 & 30 \\ 0 & 0 & -2 \end{bmatrix}
\rightarrow
\begin{bmatrix} 3 & -9 & 15 \\ 10 & -32 & 30 \\ 0 & 0 & -2 \end{bmatrix}
\rightarrow
\begin{bmatrix} 3 & 0 & 15 \\ 10 & -2 & 30 \\ 0 & 0 & -2 \end{bmatrix}
\]
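The two similarity steps of Example 7.11 can be reproduced with a few lines of code. The following is a minimal sketch in plain Python; the helper names `matmul`, `det3` and `char_poly_at` are introduced here for illustration only. It forms $EAE^{-1}$ for each elementary matrix and then checks that $-2$ and $3$ are roots of $\det(xI - M) = 0$ for both the original matrix and the simplified one.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def char_poly_at(M, x):
    """Evaluate det(xI - M) for a 3x3 matrix M."""
    return det3([[(x if r == c else 0) - M[r][c] for c in range(3)] for r in range(3)])


A = [[33, 105, 105], [10, 28, 30], [-20, -60, -62]]

# First step: add 2 times row 2 to row 3, then multiply by the inverse on the right.
E1 = [[1, 0, 0], [0, 1, 0], [0, 2, 1]]
E1_inv = [[1, 0, 0], [0, 1, 0], [0, -2, 1]]
# Second step: add -3 times row 2 to row 1, then multiply by the inverse on the right.
E2 = [[1, -3, 0], [0, 1, 0], [0, 0, 1]]
E2_inv = [[1, 3, 0], [0, 1, 0], [0, 0, 1]]

step1 = matmul(matmul(E1, A), E1_inv)
B = matmul(matmul(E2, step1), E2_inv)
print(step1)
print(B)

# The eigenvalues found for B are also roots of the characteristic polynomial of A.
for lam in (-2, 3):
    print(lam, char_poly_at(A, lam), char_poly_at(B, lam))
```

Because every step multiplies on the left by an elementary matrix and on the right by its inverse, each intermediate matrix is similar to A, so Lemma 7.10 applies at every stage.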

The third special type of matrix we will consider in this section is the triangular matrix. Recall Definition
3.12 which states that an upper (lower) triangular matrix contains all zeros below (above) the main
diagonal. Remember that finding the determinant of a triangular matrix is a simple procedure of taking
the product of the entries on the main diagonal. It turns out that there is also a simple way to find the
eigenvalues of a triangular matrix.

In  the  next  example  we  will  demonstrate  that  the  eigenvalues  of  a  triangular  matrix  are  the  entries  on 
the  main  diagonal. 


Example  7.12:  Eigenvalues  for  a  Triangular  Matrix 

Let $A = \begin{bmatrix} 1 & 2 & 4 \\ 0 & 4 & 7 \\ 0 & 0 & 6 \end{bmatrix}$. Find the eigenvalues of A.

Solution. We need to solve the equation $\det(xI - A) = 0$ as follows

\[
\det(xI - A) = \det \begin{bmatrix} x-1 & -2 & -4 \\ 0 & x-4 & -7 \\ 0 & 0 & x-6 \end{bmatrix} = (x-1)(x-4)(x-6) = 0
\]

Solving the equation $(x-1)(x-4)(x-6) = 0$ for x results in the eigenvalues $\lambda_1 = 1$, $\lambda_2 = 4$ and
$\lambda_3 = 6$. Thus the eigenvalues are the entries on the main diagonal of the original matrix. ♠

The  same  result  is  true  for  lower  triangular  matrices.  For  any  triangular  matrix,  the  eigenvalues  are 
equal  to  the  entries  on  the  main  diagonal.  To  find  the  eigenvectors  of  a  triangular  matrix,  we  use  the  usual 
procedure. 
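As a quick check of this fact, the sketch below evaluates $\det(xI - A)$ at each diagonal entry of the triangular matrix from Example 7.12. The helper functions `det` and `char_poly_at` are illustrative names, and the cofactor expansion is only meant for small matrices.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def char_poly_at(M, x):
    """Evaluate det(xI - M) for a square matrix M."""
    n = len(M)
    return det([[(x if r == c else 0) - M[r][c] for c in range(n)] for r in range(n)])


A = [[1, 2, 4], [0, 4, 7], [0, 0, 6]]
diagonal = [A[k][k] for k in range(len(A))]

print(diagonal)
print([char_poly_at(A, d) for d in diagonal])   # every diagonal entry is a root
print(char_poly_at(A, 2))                       # 2 is not on the diagonal and is not a root
```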

In  the  next  section,  we  explore  an  important  process  involving  the  eigenvalues  and  eigenvectors  of  a 
matrix. 


Exercises 


Exercise 7.1.1 If A is an invertible n × n matrix, compare the eigenvalues of A and $A^{-1}$. More generally,
for m an arbitrary integer, compare the eigenvalues of A and $A^m$.

Exercise  7.1.2  If  A  is  an  n  x  n  matrix  and  c  is  a  nonzero  constant,  compare  the  eigenvalues  of  A  and  cA. 

Exercise 7.1.3 Let A, B be invertible n × n matrices which commute. That is, AB = BA. Suppose X is an
eigenvector of B. Show that then AX must also be an eigenvector for B.


Exercise 7.1.4 Suppose A is an n × n matrix and it satisfies $A^m = A$ for some m a positive integer larger
than 1. Show that if $\lambda$ is an eigenvalue of A then $|\lambda|$ equals either 0 or 1.


Exercise 7.1.5 Show that if $AX = \lambda X$ and $AY = \lambda Y$, then whenever k, p are scalars,

\[ A(kX + pY) = \lambda(kX + pY) \]

Does this imply that kX + pY is an eigenvector? Explain.


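Exercise 7.1.6 Suppose A is a 3 × 3 matrix and the following information is available.

\[
\begin{aligned}
A \begin{bmatrix} 0 \\ -1 \\ -1 \end{bmatrix} &= 0 \begin{bmatrix} 0 \\ -1 \\ -1 \end{bmatrix} \\
A \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} &= -2 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \\
A \begin{bmatrix} -2 \\ -3 \\ -2 \end{bmatrix} &= -2 \begin{bmatrix} -2 \\ -3 \\ -2 \end{bmatrix}
\end{aligned}
\]

Find $A \begin{bmatrix} 1 \\ -4 \\ 3 \end{bmatrix}$.

Exercise 7.1.7 Suppose A is a 3 × 3 matrix and the following information is available.

\[
\begin{aligned}
A \begin{bmatrix} -1 \\ -2 \\ -2 \end{bmatrix} &= 1 \begin{bmatrix} -1 \\ -2 \\ -2 \end{bmatrix} \\
A \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} &= 0 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \\
A \begin{bmatrix} -1 \\ -4 \\ -3 \end{bmatrix} &= 2 \begin{bmatrix} -1 \\ -4 \\ -3 \end{bmatrix}
\end{aligned}
\]

Find $A \begin{bmatrix} 3 \\ -4 \\ 3 \end{bmatrix}$.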

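Exercise 7.1.8 Suppose A is a 3 × 3 matrix and the following information is available.

\[
\begin{aligned}
A \begin{bmatrix} 0 \\ -1 \\ -1 \end{bmatrix} &= 2 \begin{bmatrix} 0 \\ -1 \\ -1 \end{bmatrix} \\
A \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} &= 1 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \\
A \begin{bmatrix} -3 \\ -5 \\ -4 \end{bmatrix} &= -3 \begin{bmatrix} -3 \\ -5 \\ -4 \end{bmatrix}
\end{aligned}
\]

Find $A \begin{bmatrix} 2 \\ -3 \\ 3 \end{bmatrix}$.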


Exercise 7.1.9 Find the eigenvalues and eigenvectors of the matrix

\[ \begin{bmatrix} -6 & -92 & 12 \\ 0 & 0 & 0 \\ -2 & -31 & 4 \end{bmatrix} \]

One eigenvalue is −2.

Exercise 7.1.10 Find the eigenvalues and eigenvectors of the matrix

\[ \begin{bmatrix} -2 & -17 & -6 \\ 0 & 0 & 0 \\ 1 & 9 & 3 \end{bmatrix} \]

One eigenvalue is 1.

Exercise 7.1.11 Find the eigenvalues and eigenvectors of the matrix

\[ \begin{bmatrix} 9 & 2 & 8 \\ 2 & -6 & -2 \\ -8 & 2 & -5 \end{bmatrix} \]

One eigenvalue is −3.

Exercise 7.1.12 Find the eigenvalues and eigenvectors of the matrix

\[ \begin{bmatrix} 6 & 76 & 16 \\ -2 & -21 & -4 \\ 2 & 64 & 17 \end{bmatrix} \]

One eigenvalue is −2.

Exercise 7.1.13 Find the eigenvalues and eigenvectors of the matrix

\[ \begin{bmatrix} 3 & 5 & 2 \\ -8 & -11 & -4 \\ 10 & 11 & 3 \end{bmatrix} \]

One eigenvalue is −3.

Exercise  7.1.14  Is  it  possible  for  a  nonzero  matrix  to  have  only  0  as  an  eigenvalue? 

Exercise 7.1.15 If A is the matrix of a linear transformation which rotates all vectors in $\mathbb{R}^2$ through 60°,
explain why A cannot have any real eigenvalues. Is there an angle such that rotation through this angle
would have a real eigenvalue? What eigenvalues would be obtainable in this way?


Exercise 7.1.16 Let A be the 2 × 2 matrix of the linear transformation which rotates all vectors in $\mathbb{R}^2$
through an angle of $\theta$. For which values of $\theta$ does A have a real eigenvalue?

Exercise 7.1.17 Let T be the linear transformation which reflects vectors about the x axis. Find a matrix
for T and then find its eigenvalues and eigenvectors.

Exercise 7.1.18 Let T be the linear transformation which rotates all vectors in $\mathbb{R}^2$ counterclockwise
through an angle of $\pi/2$. Find a matrix of T and then find eigenvalues and eigenvectors.

Exercise 7.1.19 Let T be the linear transformation which reflects all vectors in $\mathbb{R}^3$ through the xy plane.
Find a matrix for T and then obtain its eigenvalues and eigenvectors.


7.2 Diagonalization


7.2.1.  Similarity  and  Diagonalization 


We begin this section by recalling the definition of similar matrices. Recall that if A, B are two n × n
matrices, then they are similar if and only if there exists an invertible matrix P such that

\[ A = P^{-1}BP \]

In this case we write $A \sim B$. The concept of similarity is an example of an equivalence relation.


Proof. It is clear that $A \sim A$, taking $P = I$.

Now, if $A \sim B$, then for some P invertible,

\[ A = P^{-1}BP \]

and so

\[ PAP^{-1} = B \]

But then

\[ \left(P^{-1}\right)^{-1} A P^{-1} = B \]

which shows that $B \sim A$.

Now suppose $A \sim B$ and $B \sim C$. Then there exist invertible matrices P, Q such that

\[ A = P^{-1}BP, \quad B = Q^{-1}CQ \]

Then,

\[ A = P^{-1}\left(Q^{-1}CQ\right)P = (QP)^{-1}C(QP) \]

showing that A is similar to C. ♠

Another  important  concept  necessary  to  this  section  is  the  trace  of  a  matrix.  Consider  the  definition. 


In  words,  the  trace  of  a  matrix  is  the  sum  of  the  entries  on  the  main  diagonal. 


The following theorem includes a reference to the characteristic polynomial of a matrix. Recall that
for any n × n matrix A, the characteristic polynomial of A is $c_A(x) = \det(xI - A)$.


We now proceed to the main concept of this section. When a matrix is similar to a diagonal matrix, the
matrix is said to be diagonalizable. We define a diagonal matrix D as a matrix containing a zero in every
entry except those on the main diagonal. More precisely, if $d_{ij}$ is the $ij$th entry of a diagonal matrix D,
then $d_{ij} = 0$ unless $i = j$. Such matrices look like the following.

\[ D = \begin{bmatrix} * & & 0 \\ & \ddots & \\ 0 & & * \end{bmatrix} \]

where * is a number which might not be zero.

The  following  is  the  formal  definition  of  a  diagonalizable  matrix. 


Notice that the defining equation $P^{-1}AP = D$ can be rearranged as $A = PDP^{-1}$. Suppose we wanted to compute
$A^{100}$. By diagonalizing A first it suffices to then compute $\left(PDP^{-1}\right)^{100}$, which reduces to $PD^{100}P^{-1}$. This
last computation is much simpler than computing $A^{100}$ directly. While this process is described in detail later, it provides
motivation for diagonalization.

7.2.2.  Diagonalizing  a  Matrix 


The  most  important  theorem  about  diagonalizability  is  the  following  major  result. 


Theorem  7.18:  Eigenvectors  and  Diagonalizable  Matrices 


An n × n matrix A is diagonalizable if and only if there is an invertible matrix P given by

\[ P = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix} \]

where the $X_k$ are eigenvectors of A.

Moreover if A is diagonalizable, the corresponding eigenvalues of A are the diagonal entries of the
diagonal matrix D.


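Proof. Suppose P is given as above as an invertible matrix whose columns are eigenvectors of A. Then
$P^{-1}$ is of the form

\[ P^{-1} = \begin{bmatrix} W_1^T \\ W_2^T \\ \vdots \\ W_n^T \end{bmatrix} \]

where $W_k^T X_j = \delta_{kj}$, which is the Kronecker's symbol defined by

\[ \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \]

Then

\[
P^{-1}AP
= \begin{bmatrix} W_1^T \\ W_2^T \\ \vdots \\ W_n^T \end{bmatrix}
\begin{bmatrix} AX_1 & AX_2 & \cdots & AX_n \end{bmatrix}
= \begin{bmatrix} W_1^T \\ W_2^T \\ \vdots \\ W_n^T \end{bmatrix}
\begin{bmatrix} \lambda_1 X_1 & \lambda_2 X_2 & \cdots & \lambda_n X_n \end{bmatrix}
= \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}
\]

Conversely, suppose A is diagonalizable so that $P^{-1}AP = D$. Let

\[ P = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix} \]

where the columns are the $X_k$ and

\[ D = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix} \]

Then

\[
AP = PD = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix}
\begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}
\]

and so

\[
\begin{bmatrix} AX_1 & AX_2 & \cdots & AX_n \end{bmatrix}
= \begin{bmatrix} \lambda_1 X_1 & \lambda_2 X_2 & \cdots & \lambda_n X_n \end{bmatrix}
\]

showing the $X_k$ are eigenvectors of A and the $\lambda_k$ are the corresponding eigenvalues.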

Notice that because the matrix P defined above is invertible it follows that the set of eigenvectors of A,
$\{X_1, X_2, \ldots, X_n\}$, forms a basis of $\mathbb{R}^n$.

We  demonstrate  the  concept  given  in  the  above  theorem  in  the  next  example.  Note  that  not  only  are 
the  columns  of  the  matrix  P  formed  by  eigenvectors,  but  P  must  be  invertible  so  must  consist  of  a  wide 
variety  of  eigenvectors.  We  achieve  this  by  using  basic  eigenvectors  for  the  columns  of  P. 


Solution. By Theorem 7.18 we use the eigenvectors of A as the columns of P, and the corresponding
eigenvalues of A as the diagonal entries of D.

First, we will find the eigenvalues of A. To do so, we solve $\det(xI - A) = 0$ as follows.

\[
\det\left( x \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
- \begin{bmatrix} 2 & 0 & 0 \\ 1 & 4 & -1 \\ -2 & -4 & 4 \end{bmatrix} \right) = 0
\]

This computation is left as an exercise, and you should verify that the eigenvalues are $\lambda_1 = 2$, $\lambda_2 = 2$,
and $\lambda_3 = 6$.

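Next, we need to find the eigenvectors. We first find the eigenvectors for $\lambda_1, \lambda_2 = 2$. Solving $(2I - A)X = 0$
to find the eigenvectors, we find that the eigenvectors are

\[ t \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} + s \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \]

where t, s are scalars. Hence there are two basic eigenvectors which are given by

\[ X_1 = \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}, \quad X_2 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \]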

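You can verify that the basic eigenvector for $\lambda_3 = 6$ is $X_3 = \begin{bmatrix} 0 \\ 1 \\ -2 \end{bmatrix}$.

Then, we construct the matrix P as follows.

\[
P = \begin{bmatrix} X_1 & X_2 & X_3 \end{bmatrix}
= \begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & -2 \end{bmatrix}
\]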


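That is, the columns of P are the basic eigenvectors of A. Then, you can verify that

\[
P^{-1} = \begin{bmatrix} -\frac{1}{4} & \frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & 1 & \frac{1}{2} \\ \frac{1}{4} & \frac{1}{2} & -\frac{1}{4} \end{bmatrix}
\]

Thus,

\[
P^{-1}AP
= \begin{bmatrix} -\frac{1}{4} & \frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & 1 & \frac{1}{2} \\ \frac{1}{4} & \frac{1}{2} & -\frac{1}{4} \end{bmatrix}
\begin{bmatrix} 2 & 0 & 0 \\ 1 & 4 & -1 \\ -2 & -4 & 4 \end{bmatrix}
\begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & -2 \end{bmatrix}
= \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 6 \end{bmatrix}
\]

You can see that the result here is a diagonal matrix where the entries on the main diagonal are the
eigenvalues of A. We expected this based on Theorem 7.18. Notice that eigenvalues on the main diagonal
must be in the same order as the corresponding eigenvectors in P. ♠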
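The diagonalization in this example can be confirmed with exact arithmetic. The sketch below uses Python's `fractions` module so that no rounding occurs; `matmul` is a helper name introduced only for this illustration. It checks that the stated inverse really is $P^{-1}$ and that $P^{-1}AP$ is the diagonal matrix of eigenvalues.

```python
from fractions import Fraction as F


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


A = [[2, 0, 0], [1, 4, -1], [-2, -4, 4]]

# Columns of P are the basic eigenvectors X1, X2 (eigenvalue 2) and X3 (eigenvalue 6).
P = [[-2, 1, 0], [1, 0, 1], [0, 1, -2]]
P_inv = [
    [F(-1, 4), F(1, 2), F(1, 4)],
    [F(1, 2), F(1), F(1, 2)],
    [F(1, 4), F(1, 2), F(-1, 4)],
]

identity = matmul(P_inv, P)
D = matmul(matmul(P_inv, A), P)

print(identity)
print(D)
```

The diagonal of `D` lists the eigenvalues in the same order as the eigenvector columns of `P`; swapping two columns of `P` would swap the matching diagonal entries.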


Consider the next important theorem.

Theorem 7.20: Linearly Independent Eigenvectors


Let A be an n × n matrix, and suppose that A has distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$. For each i, let
$X_i$ be a $\lambda_i$-eigenvector of A. Then $\{X_1, X_2, \ldots, X_m\}$ is linearly independent.


The  corollary  that  follows  from  this  theorem  gives  a  useful  tool  in  determining  if  A  is  diagonalizable. 


Corollary  7.21:  Distinct  Eigenvalues 


Let A be an n × n matrix and suppose it has n distinct eigenvalues. Then it follows that A is diagonalizable.


It is possible that a matrix A cannot be diagonalized. In other words, we cannot find an invertible
matrix P so that $P^{-1}AP = D$.

Consider  the  following  example. 


Solution. Through the usual procedure, we find that the eigenvalues of A are $\lambda_1 = 1$, $\lambda_2 = 1$. To find the
eigenvectors, we solve the equation $(\lambda I - A)X = 0$. The matrix $(\lambda I - A)$ is given by

\[ \begin{bmatrix} \lambda - 1 & -1 \\ 0 & \lambda - 1 \end{bmatrix} \]

Substituting in $\lambda = 1$, we have the matrix

\[ \begin{bmatrix} 1 - 1 & -1 \\ 0 & 1 - 1 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 0 & 0 \end{bmatrix} \]

Then, solving the equation $(\lambda I - A)X = 0$ involves carrying the following augmented matrix to its
reduced row-echelon form.

\[
\left[\begin{array}{cc|c} 0 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right]
\rightarrow \cdots \rightarrow
\left[\begin{array}{cc|c} 0 & 1 & 0 \\ 0 & 0 & 0 \end{array}\right]
\]

Then the eigenvectors are of the form

\[ t \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]

and the basic eigenvector is

\[ X_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]

In this case, the matrix A has one eigenvalue of multiplicity two, but only one basic eigenvector. In
order to diagonalize A, we need to construct an invertible 2 × 2 matrix P. However, because A only has
one basic eigenvector, we cannot construct this P. Notice that if we were to use $X_1$ as both columns of P,
P would not be invertible. For this reason, we cannot repeat eigenvectors in P.

Hence this matrix cannot be diagonalized. ♠

The  idea  that  a  matrix  may  not  be  diagonalizable  suggests  that  conditions  exist  to  determine  when  it 
is  possible  to  diagonalize  a  matrix.  We  saw  earlier  in  Corollary  7.21  that  an  n  x  n  matrix  with  n  distinct 
eigenvalues  is  diagonalizable.  It  turns  out  that  there  are  other  useful  diagonalizability  tests. 

First  we  need  the  following  definition. 


Definition  7.23:  Eigenspace 


Let A be an n × n matrix and $\lambda \in \mathbb{R}$. The eigenspace of A corresponding to $\lambda$, written $E_\lambda(A)$, is the
set of all eigenvectors corresponding to $\lambda$.


In other words, the eigenspace $E_\lambda(A)$ is all X such that $AX = \lambda X$. Notice that this set can be written
$E_\lambda(A) = \operatorname{null}(\lambda I - A)$, showing that $E_\lambda(A)$ is a subspace of $\mathbb{R}^n$.

Recall that the multiplicity of an eigenvalue $\lambda$ is the number of times that it occurs as a root of the
characteristic polynomial.

Consider  now  the  following  lemma. 


This result tells us that if $\lambda$ is an eigenvalue of A, then the number of linearly independent $\lambda$-eigenvectors
is never more than the multiplicity of $\lambda$. We now use this fact to provide a useful diagonalizability condition.


Theorem  7.25:  Diagonalizability  Condition 


Let A be an n × n matrix. Then A is diagonalizable if and only if for each eigenvalue $\lambda$ of A,
$\dim(E_\lambda(A))$ is equal to the multiplicity of $\lambda$.
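The condition in Theorem 7.25 can be tested mechanically once the eigenvalues and their multiplicities are known. The sketch below is one possible approach, not a definitive implementation: it computes $\dim E_\lambda(A) = n - \operatorname{rank}(\lambda I - A)$ by Gaussian elimination with exact fractions. The function names are hypothetical, and the caller is assumed to supply every eigenvalue together with its multiplicity. It is applied to the matrix $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$, read off from the $\lambda I - A$ of the example above, and to the 3 × 3 matrix diagonalized earlier.

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    n_cols = len(rows[0]) if rows else 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def eigenspace_dim(A, lam):
    """dim null(lam*I - A) = n - rank(lam*I - A)."""
    n = len(A)
    M = [[(lam if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    return n - rank(M)


def is_diagonalizable(A, multiplicities):
    """multiplicities maps each eigenvalue of A to its multiplicity (supplied by the caller)."""
    return all(eigenspace_dim(A, lam) == m for lam, m in multiplicities.items())


A1 = [[1, 1], [0, 1]]                       # eigenvalue 1 with multiplicity 2
A2 = [[2, 0, 0], [1, 4, -1], [-2, -4, 4]]   # eigenvalues 2 (multiplicity 2) and 6

print(eigenspace_dim(A1, 1), is_diagonalizable(A1, {1: 2}))
print(eigenspace_dim(A2, 2), eigenspace_dim(A2, 6), is_diagonalizable(A2, {2: 2, 6: 1}))
```

For the first matrix the eigenspace has dimension 1 while the multiplicity is 2, so the test fails; for the second the dimensions match the multiplicities.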
7.2.3. Complex Eigenvalues


In  some  applications,  a  matrix  may  have  eigenvalues  which  are  complex  numbers.  For  example,  this  often 
occurs  in  differential  equations.  These  questions  are  approached  in  the  same  way  as  above. 

Consider  the  following  example. 


Example  7.26:  A  Real  Matrix  with  Complex  Eigenvalues 

Let

\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix} \]

Find the eigenvalues and eigenvectors of A.

Solution. We will first find the eigenvalues as usual by solving the following equation.

\[
\det\left( x \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
- \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix} \right) = 0
\]

This reduces to $(x - 1)(x^2 - 4x + 5) = 0$. The solutions are $\lambda_1 = 1$, $\lambda_2 = 2 + i$ and $\lambda_3 = 2 - i$.

There is nothing new about finding the eigenvectors for $\lambda_1 = 1$ so this is left as an exercise.

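Consider now the eigenvalue $\lambda_2 = 2 + i$. As usual, we solve the equation $(\lambda I - A)X = 0$ as given by

\[
\left( (2 + i) \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
- \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix} \right) X
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

In other words, we need to solve the system represented by the augmented matrix

\[
\left[\begin{array}{ccc|c} 1 + i & 0 & 0 & 0 \\ 0 & i & 1 & 0 \\ 0 & -1 & i & 0 \end{array}\right]
\]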

We now use our row operations to solve the system. Divide the first row by $(1 + i)$ and then take $-i$
times the second row and add to the third row. This yields

\[
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & i & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]

Now multiply the second row by $-i$ to obtain the reduced row-echelon form, given by

\[
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & -i & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]



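Therefore, the eigenvectors are of the form

\[ t \begin{bmatrix} 0 \\ i \\ 1 \end{bmatrix} \]

and the basic eigenvector is given by

\[ X_2 = \begin{bmatrix} 0 \\ i \\ 1 \end{bmatrix} \]

As an exercise, verify that the eigenvectors for $\lambda_3 = 2 - i$ are of the form

\[ t \begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix} \]

Hence, the basic eigenvector is given by

\[ X_3 = \begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix} \]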


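As usual, be sure to check your answers! To verify, we check that $AX_3 = (2 - i)X_3$ as follows.

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix}
\begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix}
= \begin{bmatrix} 0 \\ -1 - 2i \\ 2 - i \end{bmatrix}
= (2 - i) \begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix}
\]

Therefore, we know that this eigenvector and eigenvalue are correct. ♠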

Notice that in Example 7.26, two of the eigenvalues were given by $\lambda_2 = 2 + i$ and $\lambda_3 = 2 - i$. You may
recall that these two complex numbers are conjugates. It turns out that whenever a matrix containing real
entries has a complex eigenvalue $\lambda$, it also has an eigenvalue equal to $\bar{\lambda}$, the conjugate of $\lambda$.
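The check of Example 7.26, together with the conjugate-pair observation, can be repeated using Python's built-in complex numbers (where `1j` denotes $i$). This is a small sketch; `matvec` is a helper name introduced here for illustration.

```python
def matvec(M, v):
    """Multiply a matrix by a column vector given as a flat list."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]


A = [[1, 0, 0], [0, 2, -1], [0, 1, 2]]

lam2 = 2 + 1j
X2 = [0, 1j, 1]

# Conjugating the eigenvalue and every entry of the eigenvector gives the other pair.
lam3 = lam2.conjugate()
X3 = [complex(x).conjugate() for x in X2]

print(matvec(A, X2), [lam2 * x for x in X2])   # A X2 and (2 + i) X2
print(matvec(A, X3), [lam3 * x for x in X3])   # A X3 and (2 - i) X3
```

Because the entries involved are small integers, the complex arithmetic here is exact and the two sides of each equation agree entry by entry.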


Exercises 


Exercise 7.2.1 Find the eigenvalues and eigenvectors of the matrix

\[ \begin{bmatrix} 5 & -18 & -32 \\ 0 & 5 & 4 \\ 2 & -5 & -11 \end{bmatrix} \]

One eigenvalue is 1. Diagonalize if possible.

Exercise 7.2.2 Find the eigenvalues and eigenvectors of the matrix

\[ \begin{bmatrix} -13 & -28 & 28 \\ 4 & 9 & -8 \\ -4 & -8 & 9 \end{bmatrix} \]

One eigenvalue is 3. Diagonalize if possible.

Exercise 7.2.3 Find the eigenvalues and eigenvectors of the matrix

\[ \begin{bmatrix} 89 & 38 & 268 \\ 14 & 2 & 40 \\ -30 & -12 & -90 \end{bmatrix} \]

One eigenvalue is −3. Diagonalize if possible.

Exercise 7.2.4 Find the eigenvalues and eigenvectors of the matrix

\[ \begin{bmatrix} 1 & 90 & 0 \\ 0 & -2 & 0 \\ 3 & 89 & -2 \end{bmatrix} \]

One eigenvalue is 1. Diagonalize if possible.

Exercise 7.2.5 Find the eigenvalues and eigenvectors of the matrix

\[ \begin{bmatrix} 11 & 45 & 30 \\ 10 & 26 & 20 \\ -20 & -60 & -44 \end{bmatrix} \]

One eigenvalue is 1. Diagonalize if possible.

Exercise 7.2.6 Find the eigenvalues and eigenvectors of the matrix

\[ \begin{bmatrix} 95 & 25 & 24 \\ -196 & -53 & -48 \\ -164 & -42 & -43 \end{bmatrix} \]

One eigenvalue is 5. Diagonalize if possible.

Exercise 7.2.7 Suppose A is an n × n matrix and let V be an eigenvector such that $AV = \lambda V$. Also suppose
the characteristic polynomial of A is

\[ \det(xI - A) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 \]

Explain why

\[ \left(A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I\right) V = 0 \]

If A is diagonalizable, give a proof of the Cayley Hamilton theorem based on this. This theorem says A
satisfies its characteristic equation,

\[ A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I = 0 \]


Exercise 7.2.8 Suppose the characteristic polynomial of an n × n matrix A is $1 - x^n$. Find $A^{mn}$ where m
is an integer.
Exercise 7.2.9 Find the eigenvalues and eigenvectors of the matrix

\[ \begin{bmatrix} 15 & -24 & 7 \\ -6 & 5 & -1 \\ -58 & 76 & -20 \end{bmatrix} \]

One eigenvalue is −2. Diagonalize if possible. Hint: This one has some complex eigenvalues.

Exercise 7.2.10 Find the eigenvalues and eigenvectors of the matrix

\[ \begin{bmatrix} 15 & -25 & 6 \\ -13 & 23 & -4 \\ -91 & 155 & -30 \end{bmatrix} \]

One eigenvalue is 2. Diagonalize if possible. Hint: This one has some complex eigenvalues.

Exercise 7.2.11 Find the eigenvalues and eigenvectors of the matrix

\[ \begin{bmatrix} -11 & -12 & 4 \\ 8 & 17 & -4 \\ -4 & 28 & -3 \end{bmatrix} \]

One eigenvalue is 1. Diagonalize if possible. Hint: This one has some complex eigenvalues.

Exercise 7.2.12 Find the eigenvalues and eigenvectors of the matrix

\[ \begin{bmatrix} 14 & -12 & 5 \\ -6 & 2 & -1 \\ -69 & 51 & -21 \end{bmatrix} \]

One eigenvalue is −3. Diagonalize if possible. Hint: This one has some complex eigenvalues.

Exercise 7.2.13 Suppose A is an n × n matrix consisting entirely of real entries but $a + ib$ is a complex
eigenvalue having the eigenvector $X + iY$. Here X and Y are real vectors. Show that then $a - ib$ is also an
eigenvalue with the eigenvector $X - iY$. Hint: You should remember that the conjugate of a product of
complex numbers equals the product of the conjugates. Here $a + ib$ is a complex number whose conjugate
equals $a - ib$.

Outcomes


A.  Use  diagonalization  to  find  a  high  power  of  a  matrix. 

B. Use diagonalization to solve dynamical systems.

7.3.1. Raising a Matrix to a High Power


Suppose we have a matrix A and we want to find $A^{50}$. One could try to multiply A with itself 50 times, but
this is computationally extremely intensive (try it!). However diagonalization allows us to compute high
powers of a matrix relatively easily. Suppose A is diagonalizable, so that $P^{-1}AP = D$. We can rearrange
this equation to write $A = PDP^{-1}$.

Now, consider $A^2$. Since $A = PDP^{-1}$, it follows that

\[ A^2 = \left(PDP^{-1}\right)^2 = PDP^{-1}PDP^{-1} = PD^2P^{-1} \]

Similarly,

\[ A^3 = \left(PDP^{-1}\right)^3 = PDP^{-1}PDP^{-1}PDP^{-1} = PD^3P^{-1} \]

In general,

\[ A^n = \left(PDP^{-1}\right)^n = PD^nP^{-1} \]

Therefore, we have reduced the problem to finding $D^n$. Because D is diagonal, to compute $D^n$
we only need to raise every entry on the main diagonal of D to the power of n.

Through  this  method,  we  can  compute  large  powers  of  matrices.  Consider  the  following  example. 


Example  7.27:  Raising  a  Matrix  to  a  High  Power 

Let $A = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix}$. Find $A^{50}$.

Solution. We will first diagonalize A. The steps are left as an exercise and you may wish to verify that the
eigenvalues of A are $\lambda_1 = 1$, $\lambda_2 = 1$, and $\lambda_3 = 2$.

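The basic eigenvectors corresponding to $\lambda_1, \lambda_2 = 1$ are

\[ X_1 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad X_2 = \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} \]

The basic eigenvector corresponding to $\lambda_3 = 2$ is

\[ X_3 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \]

Now we construct P by using the basic eigenvectors of A as the columns of P. Thus

\[
P = \begin{bmatrix} X_1 & X_2 & X_3 \end{bmatrix}
= \begin{bmatrix} 0 & -1 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\]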
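Then also

\[ P^{-1} = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix} \]

which you may wish to verify.
Then,

\[
P^{-1}AP
= \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix}
\begin{bmatrix} 2 & 1 & 0 \\ 0 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 0 & -1 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
= D
\]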


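Now it follows by rearranging the equation that

\[
A = PDP^{-1}
= \begin{bmatrix} 0 & -1 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix}
\]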

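Therefore,

\[
A^{50} = PD^{50}P^{-1}
= \begin{bmatrix} 0 & -1 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}^{50}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix}
\]

By our discussion above, $D^{50}$ is found as follows.

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}^{50}
= \begin{bmatrix} 1^{50} & 0 & 0 \\ 0 & 1^{50} & 0 \\ 0 & 0 & 2^{50} \end{bmatrix}
\]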

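It follows that

\[
A^{50}
= \begin{bmatrix} 0 & -1 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1^{50} & 0 & 0 \\ 0 & 1^{50} & 0 \\ 0 & 0 & 2^{50} \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix}
= \begin{bmatrix} 2^{50} & -1 + 2^{50} & 0 \\ 0 & 1 & 0 \\ 1 - 2^{50} & 1 - 2^{50} & 1 \end{bmatrix}
\]

♠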

Through  diagonalization,  we  can  efficiently  compute  a  high  power  of  A.  Without  this,  we  would  be 
forced  to  multiply  this  by  hand! 
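The computation in Example 7.27 can be compared against brute-force multiplication. The sketch below is an illustration in plain Python (the helper names `matmul` and `matpow` are not from the text); it forms $PD^{50}P^{-1}$ using only the diagonal entries of D and confirms that the result agrees with multiplying A by itself repeatedly. Python integers have arbitrary precision, so the comparison is exact.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def matpow(M, n):
    """Compute M^n for n >= 1 by repeated multiplication."""
    result = M
    for _ in range(n - 1):
        result = matmul(result, M)
    return result


A = [[2, 1, 0], [0, 1, 0], [-1, -1, 1]]
P = [[0, -1, -1], [0, 1, 0], [1, 0, 1]]
P_inv = [[1, 1, 1], [0, 1, 0], [-1, -1, 0]]

D = matmul(matmul(P_inv, A), P)                 # diagonal matrix of eigenvalues

# Raising a diagonal matrix to a power only touches the diagonal entries.
D50 = [[D[i][j] ** 50 if i == j else 0 for j in range(3)] for i in range(3)]
A50 = matmul(matmul(P, D50), P_inv)

print(D)
print(A50 == matpow(A, 50))
print(A50)
```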

The  next  section  explores  another  interesting  application  of  diagonalization. 
7.3.2. Raising a Symmetric Matrix to a High Power


We already have seen how to use matrix diagonalization to compute powers of matrices. This requires
computing eigenvalues of the matrix A, and finding an invertible matrix of eigenvectors P such that $P^{-1}AP$
is diagonal. In this section we will see that if the matrix A is symmetric (see Definition 2.29), then we can
actually find such a matrix P that is an orthogonal matrix of eigenvectors. Thus $P^{-1}$ is simply its transpose
$P^T$, and $P^TAP$ is diagonal. When this happens we say that A is orthogonally diagonalizable.

In fact this happens if and only if A is a symmetric matrix as shown in the following important theorem.


Theorem  7.28:  Principal  Axis  Theorem 


The  following  conditions  are  equivalent  for  an  n  x  n  matrix  A: 

1.  A  is  symmetric. 

2.  A  has  an  orthonormal  set  of  eigenvectors. 

3.  A  is  orthogonally  diagonalizable. 


Proof. The complete proof is beyond this course, but to give an idea assume that A has an orthonormal
set of eigenvectors, and let P consist of these eigenvectors as columns. Then $P^{-1} = P^T$, and $P^TAP = D$, a
diagonal matrix. But then $A = PDP^T$, and

\[ A^T = \left(PDP^T\right)^T = \left(P^T\right)^T D^T P^T = PDP^T = A \]

so A is symmetric.

Now given a symmetric matrix A, one shows that eigenvectors corresponding to different eigenvalues
are always orthogonal. So it suffices to apply the Gram-Schmidt process on the set of basic eigenvectors
of each eigenvalue to obtain an orthonormal set of eigenvectors. ♠

We  demonstrate  this  in  the  following  example. 


Example 7.29: Orthogonal Diagonalization of a Symmetric Matrix

Let $A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{3}{2} & \frac{1}{2} \\ 0 & \frac{1}{2} & \frac{3}{2} \end{bmatrix}$. Find an orthogonal matrix P such that $P^TAP$ is a diagonal matrix.

Solution. In this case, verify that the eigenvalues are 2 and 1. First we will find an eigenvector for the
eigenvalue 2. This involves row reducing the following augmented matrix.

\[
\left[\begin{array}{ccc|c}
2 - 1 & 0 & 0 & 0 \\
0 & 2 - \frac{3}{2} & -\frac{1}{2} & 0 \\
0 & -\frac{1}{2} & 2 - \frac{3}{2} & 0
\end{array}\right]
\]

The reduced row-echelon form is

\[
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]

and so an eigenvector is

\[ \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \]

Finally to obtain an eigenvector of length one (unit eigenvector) we simply divide this vector by its length
to yield:

\[ \begin{bmatrix} 0 \\ \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix} \]

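Next consider the case of the eigenvalue 1. To obtain basic eigenvectors, the matrix which needs to be
row reduced in this case is

\[
\left[\begin{array}{ccc|c}
1 - 1 & 0 & 0 & 0 \\
0 & 1 - \frac{3}{2} & -\frac{1}{2} & 0 \\
0 & -\frac{1}{2} & 1 - \frac{3}{2} & 0
\end{array}\right]
\]

The reduced row-echelon form is

\[
\left[\begin{array}{ccc|c} 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]

Therefore, the eigenvectors are of the form

\[ \begin{bmatrix} s \\ -t \\ t \end{bmatrix} \]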

Note that all these vectors are automatically orthogonal to eigenvectors corresponding to the first eigenvalue.
This follows from the fact that A is symmetric, as mentioned earlier.

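We obtain basic eigenvectors

\[ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix} \]

Since they are themselves orthogonal (by luck here) we do not need to use the Gram-Schmidt process and
instead simply normalize these vectors to obtain

\[ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 0 \\ -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix} \]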

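An orthogonal matrix P to orthogonally diagonalize A is then obtained by letting these basic vectors be
the columns.

\[
P = \begin{bmatrix} 0 & 1 & 0 \\ -\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \end{bmatrix}
\]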
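We verify this works. $P^TAP$ is of the form

\[
\begin{bmatrix} 0 & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{3}{2} & \frac{1}{2} \\ 0 & \frac{1}{2} & \frac{3}{2} \end{bmatrix}
\begin{bmatrix} 0 & 1 & 0 \\ -\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\]

which is the desired diagonal matrix. ♠

We can now apply this technique to efficiently compute high powers of a symmetric matrix.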


Example 7.30: Powers of a Symmetric Matrix

Let $A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{3}{2} & \frac{1}{2} \\ 0 & \frac{1}{2} & \frac{3}{2} \end{bmatrix}$. Compute $A^7$.

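Solution. We found in Example 7.29 that $P^TAP = D$ is diagonal, where

\[
P = \begin{bmatrix} 0 & 1 & 0 \\ -\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \end{bmatrix}
\quad \text{and} \quad
D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\]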

Thus  A  =  PDP 1  and  A7  =  PDF'  PDP 1  ■  ■  ■  PDF1  =  PD1  PT  which  gives: 


A7  = 


0 

_  j_ 

4 


0 

± 

f 

4 


1  0 
0  5 

0  4 


1  0 

-  0  ^ 

0  75 


1  0  0 
0  1  0 
0  0  2  _ 

1  0  0 
0  1  0 
0  0  27 


0 

T2 

4 


4  o  4 


4  o  4 


o 

± 

4  J 


0  — L  J_ 
W  vT  V2 


1  0 


1 


0 


o  X  J_ 
v  V2  V2 

o  — L  J_ 

u  vT  4 


0  0 


o  x  _L 

u  V2  4 

o-4  4 

u  V2  4 

1  0  0 

o  4  4 

u  V2  4 

1  0  0 


0 


2' +  1  27  — 1 


2—1  2' +  1 
2  2 


0 
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* 
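The computation in Example 7.30 can be checked numerically. The following is a minimal sketch in plain Python, assuming the matrices $P$ and $D$ of Example 7.29; the helper names (`matmul`, `transpose`, `power_via_diagonalization`) are illustrative and do not come from the text.

```python
import math

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]

s = 1 / math.sqrt(2)
# Symmetric matrix A, orthogonal matrix P and eigenvalues (diagonal of D).
A = [[1, 0, 0],
     [0, 1.5, 0.5],
     [0, 0.5, 1.5]]
P = [[0, 1, 0],
     [-s, 0, s],
     [s, 0, s]]
eigenvalues = [1, 1, 2]

def power_via_diagonalization(P, eigenvalues, k):
    """Return P D^k P^T, where D = diag(eigenvalues) and P is orthogonal."""
    n = len(eigenvalues)
    Dk = [[eigenvalues[i] ** k if i == j else 0 for j in range(n)]
          for i in range(n)]
    return matmul(matmul(P, Dk), transpose(P))

def power_by_repeated_multiplication(A, k):
    """Return A^k by multiplying k copies of A (for comparison only)."""
    result = A
    for _ in range(k - 1):
        result = matmul(result, A)
    return result

A7 = power_via_diagonalization(P, eigenvalues, 7)
for row in A7:
    print([round(v, 6) for v in row])
```

Only the diagonal entries of $D$ are raised to the power, which is why the diagonalized form is cheaper than seven full matrix products when the exponent is large.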


7.3.3.  Markov  Matrices 


There  are  applications  of  great  importance  which  feature  a  special  type  of  matrix.  Matrices  whose  columns 
consist  of  non-negative  numbers  that  sum  to  one  are  called  Markov  matrices.  An  important  application 
of  Markov  matrices  is  in  population  migration,  as  illustrated  in  the  following  definition. 


Definition  7.31:  Migration  Matrices 


Let m locations be denoted by the numbers $1, 2, \cdots, m$. Suppose that each year the proportion of residents in location j which move to location i is $a_{ij}$. Also suppose that no one leaves these m locations and no one enters them from outside. This last assumption requires $\sum_i a_{ij} = 1$, and means that the matrix $A = [a_{ij}]$ is a Markov matrix. In this context, A is also called a migration matrix.


Consider  the  following  example  which  demonstrates  this  situation. 


Example  7.32:  Migration  Matrix 


Let A be a Markov matrix given by

\[
A = \begin{bmatrix} .4 & .2 \\ .6 & .8 \end{bmatrix}
\]

Verify that A is a Markov matrix and describe the entries of A in terms of population migration.


Solution. The columns of A consist of non-negative numbers which sum to 1. Hence, A is a Markov matrix.

Now, consider the entries $a_{ij}$ of A in terms of population. The entry $a_{11} = .4$ is the proportion of residents in location 1 which stay in location 1 in a given time period. Entry $a_{21} = .6$ is the proportion of residents in location 1 which move to location 2 in the same time period. Entry $a_{12} = .2$ is the proportion of residents in location 2 which move to location 1. Finally, entry $a_{22} = .8$ is the proportion of residents in location 2 which stay in location 2 in this time period.

Considered as a Markov matrix, these numbers are usually identified with probabilities. Hence, we can say that the probability that a resident of location 1 will stay in location 1 in the time period is .4.

* 

Observe that in Example 7.32, if there were initially, say, 15 thousand people in location 1 and 10 thousand in location 2, then after one year there would be $.4 \times 15 + .2 \times 10 = 8$ thousand people in location 1, and similarly there would be $.6 \times 15 + .8 \times 10 = 17$ thousand people in location 2.

More generally let $X_n = [x_{1n} \cdots x_{mn}]^T$ where $x_{in}$ is the population of location i at time period n. We call $X_n$ the state vector at period n. In particular, we call $X_0$ the initial state vector. Letting A be the migration matrix, we compute the population in each location one time period later by $AX_n$. In order to find the population of location i after k years, we compute the ith component of $A^k X_0$. This discussion is summarized in the following theorem.


Theorem  7.33:  State  Vector 


Let A be the migration matrix of a population and let $X_n$ be the vector whose entries give the population of each location at time period n. Then $X_n$ is the state vector at period n and it follows that

\[
X_{n+1} = AX_n
\]


The sum of the entries of $X_n$ will equal the sum of the entries of the initial vector $X_0$. Since the columns of A sum to 1, this sum is preserved for every multiplication by A as demonstrated below.

\[
\sum_i \sum_j a_{ij} x_j = \sum_j x_j \left( \sum_i a_{ij} \right) = \sum_j x_j
\]


Consider  the  following  example. 


Example  7.34:  Using  a  Migration  Matrix 


Consider the migration matrix

\[
A = \begin{bmatrix}
.6 & 0 & .1 \\
.2 & .8 & 0 \\
.2 & .2 & .9
\end{bmatrix}
\]

for locations 1, 2, and 3. Suppose initially there are 100 residents in location 1, 200 in location 2 and 400 in location 3. Find the population in the three locations after 1, 2, and 10 units of time.


Solution. Using Theorem 7.33 we can find the population in each location using the equation $X_{n+1} = AX_n$. For the population after 1 unit, we calculate $X_1 = AX_0$ as follows.

\[
X_1 = AX_0
\]
\[
\begin{bmatrix} x_{11} \\ x_{21} \\ x_{31} \end{bmatrix}
=
\begin{bmatrix}
.6 & 0 & .1 \\
.2 & .8 & 0 \\
.2 & .2 & .9
\end{bmatrix}
\begin{bmatrix} 100 \\ 200 \\ 400 \end{bmatrix}
=
\begin{bmatrix} 100 \\ 180 \\ 420 \end{bmatrix}
\]

Therefore after one time period, location 1 has 100 residents, location 2 has 180, and location 3 has 420. Notice that the total population is unchanged; it simply migrates within the given locations. We find the populations after two time periods in the same way.


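\[
X_2 = AX_1
\]
\[
\begin{bmatrix} x_{12} \\ x_{22} \\ x_{32} \end{bmatrix}
=
\begin{bmatrix}
.6 & 0 & .1 \\
.2 & .8 & 0 \\
.2 & .2 & .9
\end{bmatrix}
\begin{bmatrix} 100 \\ 180 \\ 420 \end{bmatrix}
=
\begin{bmatrix} 102 \\ 164 \\ 434 \end{bmatrix}
\]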

We could progress in this manner to find the populations after 10 time periods. However from our above discussion, we can simply calculate $(A^n X_0)_i$, where n denotes the number of time periods which have passed. Therefore, we compute the populations in each location after 10 units of time as follows.

\[
X_{10} = A^{10} X_0
\]
\[
\begin{bmatrix} x_{1\,10} \\ x_{2\,10} \\ x_{3\,10} \end{bmatrix}
=
\begin{bmatrix}
.6 & 0 & .1 \\
.2 & .8 & 0 \\
.2 & .2 & .9
\end{bmatrix}^{10}
\begin{bmatrix} 100 \\ 200 \\ 400 \end{bmatrix}
=
\begin{bmatrix} 115.08582922 \\ 120.13067244 \\ 464.78349834 \end{bmatrix}
\]

Since we are speaking about populations, we would need to round these numbers to provide a logical answer. Therefore, we can say that after 10 units of time, there will be 115 residents in location 1, 120 in location 2, and 465 in location 3.
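The repeated multiplication in Example 7.34 is easy to reproduce. The sketch below, in plain Python, applies $X_{n+1} = AX_n$ step by step; the function names `step` and `state_after` are illustrative assumptions, not notation from the text.

```python
def step(A, X):
    """One time period: return the product A X (A is a list of rows)."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

def state_after(A, X0, n):
    """Return X_n = A^n X_0 by applying the migration matrix n times."""
    X = list(X0)
    for _ in range(n):
        X = step(A, X)
    return X

# Migration matrix and initial state vector of Example 7.34.
A = [[0.6, 0.0, 0.1],
     [0.2, 0.8, 0.0],
     [0.2, 0.2, 0.9]]
X0 = [100, 200, 400]

for n in (1, 2, 10):
    Xn = state_after(A, X0, n)
    # The total stays at 700 because every column of A sums to 1.
    print(n, [round(x, 4) for x in Xn], "total:", round(sum(Xn), 6))
```

Printing the total alongside each state vector is a quick way to see the conservation property stated after Theorem 7.33.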

A second important application of Markov matrices is the concept of random walks. Suppose a walker has m locations to choose from, denoted $1, 2, \cdots, m$. Let $a_{ij}$ refer to the probability that the person will travel to location i from location j. Again, this requires that

\[
\sum_{i=1}^{m} a_{ij} = 1
\]

In this context, the vector $X_n = [x_{1n} \cdots x_{mn}]^T$ contains the probabilities $x_{in}$ that the walker ends up in location i, $1 \leq i \leq m$, at time n.


Example  7.35:  Random  Walks 


Suppose three locations exist, referred to as locations 1, 2 and 3. The Markov matrix of probabilities $A = [a_{ij}]$ is given by

\[
\begin{bmatrix}
0.4 & 0.1 & 0.5 \\
0.4 & 0.6 & 0.1 \\
0.2 & 0.3 & 0.4
\end{bmatrix}
\]

If the walker starts in location 1, calculate the probability that he ends up in location 3 at time $n = 2$.


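Solution. Since the walker begins in location 1, we have

\[
X_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
\]

The goal is to calculate $x_{32}$. To do this we calculate $X_2$, using $X_{n+1} = AX_n$.

\[
X_1 = AX_0 =
\begin{bmatrix}
0.4 & 0.1 & 0.5 \\
0.4 & 0.6 & 0.1 \\
0.2 & 0.3 & 0.4
\end{bmatrix}
\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
=
\begin{bmatrix} 0.4 \\ 0.4 \\ 0.2 \end{bmatrix}
\]
\[
X_2 = AX_1 =
\begin{bmatrix}
0.4 & 0.1 & 0.5 \\
0.4 & 0.6 & 0.1 \\
0.2 & 0.3 & 0.4
\end{bmatrix}
\begin{bmatrix} 0.4 \\ 0.4 \\ 0.2 \end{bmatrix}
=
\begin{bmatrix} 0.3 \\ 0.42 \\ 0.28 \end{bmatrix}
\]

This gives the probabilities that our walker ends up in locations 1, 2, and 3. For this example we are interested in location 3, with a probability of 0.28.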
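The random walk calculation can be carried out exactly with rational arithmetic. The following sketch uses Python's `fractions` module to avoid rounding; the function name `distribution_after` and its 1-based `start` parameter are assumptions made for illustration.

```python
from fractions import Fraction as F

# Markov matrix of Example 7.35, entered as exact fractions.
A = [[F(4, 10), F(1, 10), F(5, 10)],
     [F(4, 10), F(6, 10), F(1, 10)],
     [F(2, 10), F(3, 10), F(4, 10)]]

def distribution_after(A, start, n):
    """Probability vector X_n for a walker who begins in location `start`.

    `start` is 1-based, matching the numbering of locations in the text.
    """
    X = [F(1) if i == start - 1 else F(0) for i in range(len(A))]
    for _ in range(n):
        X = [sum(a * x for a, x in zip(row, X)) for row in A]
    return X

X2 = distribution_after(A, start=1, n=2)
print([float(p) for p in X2])   # third entry is the probability of location 3
```

Because each column of the matrix sums to 1, the entries of every $X_n$ again sum to 1, so they remain a probability distribution.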

Returning to the context of migration, suppose we wish to know how many residents will be in a certain location after a very long time. It turns out that if some power of the migration matrix has all positive entries, then there is a vector $X_s$ such that $A^n X_0$ approaches $X_s$ as n becomes very large. Hence as more time passes and n increases, $A^n X_0$ will become closer to the vector $X_s$.

Consider Theorem 7.33. Let n increase so that $X_n$ approaches $X_s$. As $X_n$ becomes closer to $X_s$, so too does $X_{n+1}$. For sufficiently large n, the statement $X_{n+1} = AX_n$ can be written as $X_s = AX_s$.

This  discussion  motivates  the  following  theorem. 


Theorem  7.36:  Steady  State  Vector 


Let A be a migration matrix. Then there exists a steady state vector written $X_s$ such that

\[
X_s = AX_s
\]

where $X_s$ has positive entries which have the same sum as the entries of $X_0$.

As n increases, the state vectors $X_n$ will approach $X_s$.


Note that the condition in Theorem 7.36 can be written as $(I - A)X_s = 0$, representing a homogeneous system of equations.

Consider the following example. Notice that it is the same example as Example 7.34, but here it will involve a longer time frame.
Example 7.37: Populations over the Long Run

Consider the migration matrix

\[
A = \begin{bmatrix}
.6 & 0 & .1 \\
.2 & .8 & 0 \\
.2 & .2 & .9
\end{bmatrix}
\]

for locations 1, 2, and 3. Suppose initially there are 100 residents in location 1, 200 in location 2 and 400 in location 3. Find the population in the three locations after a long time.


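Solution. By Theorem 7.36 the steady state vector $X_s$ can be found by solving the system $(I - A)X_s = 0$. Thus we need to find a solution to

\[
\left(
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}
-
\begin{bmatrix}
.6 & 0 & .1 \\
.2 & .8 & 0 \\
.2 & .2 & .9
\end{bmatrix}
\right)
\begin{bmatrix} x_{1s} \\ x_{2s} \\ x_{3s} \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

The augmented matrix and the resulting reduced row-echelon form are given by

\[
\left[\begin{array}{ccc|c}
0.4 & 0 & -0.1 & 0 \\
-0.2 & 0.2 & 0 & 0 \\
-0.2 & -0.2 & 0.1 & 0
\end{array}\right]
\rightarrow \cdots \rightarrow
\left[\begin{array}{ccc|c}
1 & 0 & -0.25 & 0 \\
0 & 1 & -0.25 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

Therefore, the eigenvectors are

\[
t \begin{bmatrix} 0.25 \\ 0.25 \\ 1 \end{bmatrix}
\]

The initial vector $X_0$ is given by

\[
\begin{bmatrix} 100 \\ 200 \\ 400 \end{bmatrix}
\]

Now all that remains is to choose the value of t such that

\[
0.25t + 0.25t + t = 100 + 200 + 400
\]

Solving this equation for t yields $t = \frac{1400}{3}$. Therefore the population in the long run is given by

\[
\frac{1400}{3} \begin{bmatrix} 0.25 \\ 0.25 \\ 1 \end{bmatrix}
=
\begin{bmatrix} 116.6666666666667 \\ 116.6666666666667 \\ 466.6666666666667 \end{bmatrix}
\]

Again, because we are working with populations, these values need to be rounded. Rounding so that the total remains 700, the steady state vector $X_s$ is given by

\[
\begin{bmatrix} 117 \\ 117 \\ 466 \end{bmatrix}
\]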

* 
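The steady state computation of Example 7.37 can be sketched in code by row reducing $I - A$ with exact fractions and then scaling the solution to the required total. This is only an illustrative sketch: the names `rref` and `steady_state` are assumptions, and `steady_state` assumes, as in this example, that the solution space is one-dimensional with the last variable free.

```python
from fractions import Fraction as F

def rref(M):
    """Return (R, pivots): the reduced row-echelon form of M and its pivot columns."""
    R = [row[:] for row in M]
    pivots, r = [], 0
    for c in range(len(R[0])):
        pivot = next((i for i in range(r, len(R)) if R[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        R[r], R[pivot] = R[pivot], R[r]
        lead = R[r][c]
        R[r] = [v / lead for v in R[r]]   # scale the pivot row to a leading 1
        for i in range(len(R)):
            if i != r and R[i][c] != 0:
                factor = R[i][c]
                R[i] = [vi - factor * vr for vi, vr in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
        if r == len(R):
            break
    return R, pivots

def steady_state(A, total):
    """Solve (I - A) X = 0 and scale X so that its entries sum to `total`."""
    n = len(A)
    I_minus_A = [[(1 if i == j else 0) - A[i][j] for j in range(n)]
                 for i in range(n)]
    R, pivots = rref(I_minus_A)
    # Assumption: exactly one free variable, and it is the last one.
    assert pivots == list(range(n - 1))
    basic = [-R[i][n - 1] for i in range(n - 1)] + [F(1)]
    t = F(total) / sum(basic)
    return [t * b for b in basic]

A = [[F(6, 10), F(0), F(1, 10)],
     [F(2, 10), F(8, 10), F(0)],
     [F(2, 10), F(2, 10), F(9, 10)]]
Xs = steady_state(A, 700)
print([float(x) for x in Xs])
```

Working with `Fraction` keeps the zero row of the reduced matrix exactly zero, which floating-point elimination would not guarantee.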
We can see that the numbers we calculated in Example 7.34 for the populations after the 10th unit of time are not far from the long term values.

Consider  another  example. 


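Solution. In order to compare the populations in the long term, we want to find the steady state vector $X_s$. Solve

\[
\left(
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}
-
\begin{bmatrix}
\frac{1}{5} & \frac{1}{2} & \frac{1}{5} \\
\frac{1}{4} & \frac{1}{4} & \frac{1}{2} \\
\frac{11}{20} & \frac{1}{4} & \frac{3}{10}
\end{bmatrix}
\right)
\begin{bmatrix} x_{1s} \\ x_{2s} \\ x_{3s} \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

The augmented matrix and the resulting reduced row-echelon form are given by

\[
\left[\begin{array}{ccc|c}
\frac{4}{5} & -\frac{1}{2} & -\frac{1}{5} & 0 \\
-\frac{1}{4} & \frac{3}{4} & -\frac{1}{2} & 0 \\
-\frac{11}{20} & -\frac{1}{4} & \frac{7}{10} & 0
\end{array}\right]
\rightarrow \cdots \rightarrow
\left[\begin{array}{ccc|c}
1 & 0 & -\frac{16}{19} & 0 \\
0 & 1 & -\frac{18}{19} & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

and so an eigenvector is

\[
\begin{bmatrix} 16 \\ 18 \\ 19 \end{bmatrix}
\]

Therefore, the proportion of population in location 2 to location 1 is given by $\frac{18}{16}$. The proportion of population in location 3 to location 2 is given by $\frac{19}{18}$.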
7.3.3.1. Eigenvalues of Markov Matrices


The  following  is  an  important  proposition. 


Proposition  7.39:  Eigenvalues  of  a  Migration  Matrix 


Let $A = [a_{ij}]$ be a migration matrix. Then 1 is always an eigenvalue for A.


Proof. Remember that the determinant of a matrix always equals that of its transpose. Therefore,

\[
\det(xI - A) = \det\left((xI - A)^T\right) = \det\left(xI - A^T\right)
\]

because $I^T = I$. Thus the characteristic equation for A is the same as the characteristic equation for $A^T$. Consequently, A and $A^T$ have the same eigenvalues. We will show that 1 is an eigenvalue for $A^T$ and then it will follow that 1 is an eigenvalue for A.

Remember that for a migration matrix, $\sum_i a_{ij} = 1$. Therefore, if $A^T = [b_{ij}]$ with $b_{ij} = a_{ji}$, it follows that

\[
\sum_j b_{ij} = \sum_j a_{ji} = 1
\]

Therefore, from matrix multiplication,

\[
A^T \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}
=
\begin{bmatrix} \sum_j b_{1j} \\ \vdots \\ \sum_j b_{mj} \end{bmatrix}
=
\begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}
\]

Notice that this shows that $\begin{bmatrix} 1 & \cdots & 1 \end{bmatrix}^T$ is an eigenvector for $A^T$ corresponding to the eigenvalue $\lambda = 1$. As explained above, this shows that $\lambda = 1$ is an eigenvalue for A because A and $A^T$ have the same eigenvalues.
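The argument in this proof can be checked on a concrete migration matrix. The sketch below uses the matrix of Example 7.34 and exact fractions; the helper names `det` and `is_markov` are illustrative, and the cofactor-expansion determinant is chosen for brevity rather than efficiency.

```python
from fractions import Fraction as F

def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def is_markov(A):
    """True if all entries are non-negative and every column sums to 1."""
    n = len(A)
    nonnegative = all(A[i][j] >= 0 for i in range(n) for j in range(n))
    columns_sum_to_one = all(sum(A[i][j] for i in range(n)) == 1 for j in range(n))
    return nonnegative and columns_sum_to_one

A = [[F(6, 10), F(0), F(1, 10)],
     [F(2, 10), F(8, 10), F(0)],
     [F(2, 10), F(2, 10), F(9, 10)]]
n = len(A)

# A^T times the vector of ones gives the column sums of A, i.e. the vector of ones.
A_transpose = [list(col) for col in zip(*A)]
AT_times_ones = [sum(row) for row in A_transpose]

# 1 is an eigenvalue of A exactly when det(1*I - A) = 0.
I_minus_A = [[(1 if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
print(is_markov(A), AT_times_ones, det(I_minus_A))
```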


7.3.4.  Dynamical  Systems 


The  migration  matrices  discussed  above  give  an  example  of  a  discrete  dynamical  system.  We  call  them 
discrete  because  they  involve  discrete  values  taken  at  a  sequence  of  points  rather  than  on  a  continuous 
interval  of  time. 

An example of a situation which can be studied in this way is a predator prey model. Consider the following model where x is the number of prey and y the number of predators in a certain area at a certain time. These are functions of $n \in \mathbb{N}$ where $n = 1, 2, \cdots$ are the ends of intervals of time which may be of interest in the problem. In other words, $x(n)$ is the number of prey at the end of the nth interval of time. An example of this situation may be modeled by the following equation

\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix}
=
\begin{bmatrix} 2 & -3 \\ 1 & 4 \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]
This says that from time period n to n + 1, x increases if there are more x and decreases as there are more y. In the context of this example, this means that as the number of predators increases, the number of prey decreases. As for y, it increases if there are more y and also if there are more x.

This  is  an  example  of  a  matrix  recurrence  which  we  define  now. 


Definition  7.40:  Matrix  Recurrence 

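Suppose a dynamical system is given by

\[
\begin{array}{l}
x_{n+1} = a x_n + b y_n \\
y_{n+1} = c x_n + d y_n
\end{array}
\]

This system can be expressed as $V_{n+1} = AV_n$ where $V_n = \begin{bmatrix} x_n \\ y_n \end{bmatrix}$ and $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$.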

In this section, we will examine how to find solutions to a dynamical system given certain initial conditions. This process involves several concepts previously studied, including matrix diagonalization and Markov matrices. The procedure is given as follows. Recall that when diagonalized, we can write $A^n = PD^nP^{-1}$.


We  will  now  consider  an  example  in  detail. 
Example 7.42: Solutions of a Discrete Dynamical System

Suppose a dynamical system is given by

\[
\begin{array}{l}
x_{n+1} = 1.5 x_n - 0.5 y_n \\
y_{n+1} = 1.0 x_n
\end{array}
\]

Express this system as a matrix recurrence and find solutions to the dynamical system for initial conditions $x_0 = 20$, $y_0 = 10$.


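Solution. First, we express the system as a matrix recurrence.

\[
V_{n+1} = AV_n
\]
\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix}
=
\begin{bmatrix} 1.5 & -0.5 \\ 1.0 & 0 \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]

Then

\[
A = \begin{bmatrix} 1.5 & -0.5 \\ 1.0 & 0 \end{bmatrix}
\]

You can verify that the eigenvalues of A are 1 and .5. By diagonalizing, we can write A in the form

\[
PDP^{-1} =
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & .5 \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\]

Now given an initial condition

\[
V_0 = \begin{bmatrix} x_0 \\ y_0 \end{bmatrix}
\]

the solution to the dynamical system is given by

\[
V_n = PD^nP^{-1}V_0
\]
\[
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
=
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & .5 \end{bmatrix}^{n}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} x_0 \\ y_0 \end{bmatrix}
\]
\[
=
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & (.5)^n \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} x_0 \\ y_0 \end{bmatrix}
\]
\[
=
\begin{bmatrix}
y_0\left((.5)^n - 1\right) - x_0\left((.5)^n - 2\right) \\
y_0\left(2(.5)^n - 1\right) - x_0\left(2(.5)^n - 2\right)
\end{bmatrix}
\]

If we let n become arbitrarily large, this vector approaches

\[
\begin{bmatrix} 2x_0 - y_0 \\ 2x_0 - y_0 \end{bmatrix}
\]

Thus for large n,

\[
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\approx
\begin{bmatrix} 2x_0 - y_0 \\ 2x_0 - y_0 \end{bmatrix}
\]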
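Now suppose the initial condition is given by

\[
\begin{bmatrix} x_0 \\ y_0 \end{bmatrix}
=
\begin{bmatrix} 20 \\ 10 \end{bmatrix}
\]

Then, we can find solutions for various values of n. Here are the solutions for values of n between 1 and 5.

\[
n = 1: \begin{bmatrix} 25.0 \\ 20.0 \end{bmatrix}, \quad
n = 2: \begin{bmatrix} 27.5 \\ 25.0 \end{bmatrix}, \quad
n = 3: \begin{bmatrix} 28.75 \\ 27.5 \end{bmatrix}
\]
\[
n = 4: \begin{bmatrix} 29.375 \\ 28.75 \end{bmatrix}, \quad
n = 5: \begin{bmatrix} 29.688 \\ 29.375 \end{bmatrix}
\]

Notice that as n increases, we approach the vector given by

\[
\begin{bmatrix} 2x_0 - y_0 \\ 2x_0 - y_0 \end{bmatrix}
=
\begin{bmatrix} 2(20) - 10 \\ 2(20) - 10 \end{bmatrix}
=
\begin{bmatrix} 30 \\ 30 \end{bmatrix}
\]

These solutions are graphed in the following figure.

[Figure: plot of the solutions (x(n), y(n)) listed above.]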


* 
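The closed form found in Example 7.42 can be compared against direct iteration of the recurrence. The following Python sketch does this for the initial condition $x_0 = 20$, $y_0 = 10$; the function names `iterate` and `closed_form` are illustrative assumptions.

```python
def iterate(x0, y0, n):
    """Apply x_{k+1} = 1.5 x_k - 0.5 y_k, y_{k+1} = x_k a total of n times."""
    x, y = x0, y0
    for _ in range(n):
        x, y = 1.5 * x - 0.5 * y, x
    return x, y

def closed_form(x0, y0, n):
    """Entries of V_n = P D^n P^{-1} V_0, multiplied out as in the example."""
    r = 0.5 ** n
    return (y0 * (r - 1) - x0 * (r - 2),
            y0 * (2 * r - 1) - x0 * (2 * r - 2))

for n in range(1, 6):
    print(n, iterate(20, 10, n), closed_form(20, 10, n))

# For large n the term (.5)^n vanishes and both entries approach 2*x0 - y0 = 30.
print(closed_form(20, 10, 60))
```

The closed form needs only one power of .5, whereas the iteration performs n matrix-vector products; for this small system both are cheap, so the sketch mainly serves as a consistency check.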

The  following  example  demonstrates  another  system  which  exhibits  some  interesting  behavior.  When 
we  graph  the  solutions,  it  is  possible  for  the  ordered  pairs  to  spiral  around  the  origin. 


Example  7.43:  Finding  Solutions  to  a  Dynamical  System 


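Suppose a dynamical system is of the form

\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix}
=
\begin{bmatrix} 0.7 & 0.7 \\ -0.7 & 0.7 \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]

Find solutions to the dynamical system for given initial conditions.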


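Solution. Let

\[
A = \begin{bmatrix} 0.7 & 0.7 \\ -0.7 & 0.7 \end{bmatrix}
\]

To find solutions, we must diagonalize A. You can verify that the eigenvalues of A are complex and are given by $\lambda_1 = .7 + .7i$ and $\lambda_2 = .7 - .7i$. The eigenvector for $\lambda_1 = .7 + .7i$ is

\[
\begin{bmatrix} 1 \\ i \end{bmatrix}
\]

and the eigenvector for $\lambda_2 = .7 - .7i$ is

\[
\begin{bmatrix} 1 \\ -i \end{bmatrix}
\]

Thus the matrix A can be written in the form

\[
\begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix}
\begin{bmatrix} .7 + .7i & 0 \\ 0 & .7 - .7i \end{bmatrix}
\begin{bmatrix} \frac{1}{2} & -\frac{1}{2}i \\ \frac{1}{2} & \frac{1}{2}i \end{bmatrix}
\]

and so,

\[
V_n = PD^nP^{-1}V_0
\]
\[
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
=
\begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix}
\begin{bmatrix} (.7 + .7i)^n & 0 \\ 0 & (.7 - .7i)^n \end{bmatrix}
\begin{bmatrix} \frac{1}{2} & -\frac{1}{2}i \\ \frac{1}{2} & \frac{1}{2}i \end{bmatrix}
\begin{bmatrix} x_0 \\ y_0 \end{bmatrix}
\]

The explicit solution is given by

\[
\begin{bmatrix}
x_0\left(\frac{1}{2}(0.7 - 0.7i)^n + \frac{1}{2}(0.7 + 0.7i)^n\right) + y_0\left(\frac{1}{2}i(0.7 - 0.7i)^n - \frac{1}{2}i(0.7 + 0.7i)^n\right) \\
y_0\left(\frac{1}{2}(0.7 - 0.7i)^n + \frac{1}{2}(0.7 + 0.7i)^n\right) - x_0\left(\frac{1}{2}i(0.7 - 0.7i)^n - \frac{1}{2}i(0.7 + 0.7i)^n\right)
\end{bmatrix}
\]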


Suppose the initial condition is

\[
\begin{bmatrix} x_0 \\ y_0 \end{bmatrix}
=
\begin{bmatrix} 10 \\ 10 \end{bmatrix}
\]

Then one obtains the following sequence of values, which are graphed below by letting $n = 1, 2, \cdots, 20$.

[Figure: plot of the points (x(n), y(n)) for n = 1, ..., 20, joined by a dashed line.]

In this picture, the dots are the values and the dashed line is to help to picture what is happening.

These points are getting gradually closer to the origin, but they are circling the origin in the clockwise direction as they do so. As n increases, the vector $\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}$ approaches $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$.
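The spiral behavior in Example 7.43 can be reproduced with Python's built-in complex numbers. The sketch below compares direct iteration with the explicit solution in terms of $(0.7 \pm 0.7i)^n$; the function names are illustrative assumptions.

```python
import math

def iterate(x0, y0, n):
    """Apply the matrix [[0.7, 0.7], [-0.7, 0.7]] to (x0, y0) n times."""
    x, y = x0, y0
    for _ in range(n):
        x, y = 0.7 * x + 0.7 * y, -0.7 * x + 0.7 * y
    return x, y

def closed_form(x0, y0, n):
    """Explicit solution written with the complex eigenvalues 0.7 +/- 0.7i."""
    lam1, lam2 = complex(0.7, 0.7), complex(0.7, -0.7)
    c = 0.5 * (lam2 ** n + lam1 ** n)      # real-valued combination
    s = 0.5j * (lam2 ** n - lam1 ** n)     # real-valued combination
    x = x0 * c + y0 * s
    y = y0 * c - x0 * s
    return x.real, y.real

for n in range(0, 21, 4):
    x, y = iterate(10, 10, n)
    # The distance to the origin shrinks by |0.7 + 0.7i| = sqrt(0.98) per step.
    print(n, round(x, 4), round(y, 4), round(math.hypot(x, y), 4))
```

Since $|0.7 \pm 0.7i| = \sqrt{0.98} < 1$, the distance to the origin decreases slowly, and because the matrix is a scaled rotation by 45 degrees clockwise, the point returns to its original direction every 8 steps.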


This type of behavior along with complex eigenvalues is typical of the deviations from an equilibrium point in the Lotka Volterra system of differential equations, which is a famous model for predator-prey interactions. These differential equations are given by

\[
\begin{array}{l}
X' = X(a - bY) \\
Y' = -Y(c - dX)
\end{array}
\]

where a, b, c, d are positive constants. For example, you might have X be the population of moose and Y the population of wolves on an island.

Note that these equations make logical sense. The top says that the rate at which the moose population increases would be aX if there were no predators Y. However, this is modified by multiplying instead by $(a - bY)$ because if there are predators, these will militate against the population of moose. The more predators there are, the more pronounced is this effect. As to the predator equation, you can see that the equations predict that if there are many prey around, then the rate of growth of the predators would seem to be high. However, this is modified by the term $-cY$ because if there are many predators, there would be competition for the available food supply and this would tend to decrease $Y'$.

The behavior near an equilibrium point, which is a point where the right side of the differential equations equals zero, is of great interest. In this case, the equilibrium point is

\[
X = \frac{c}{d}, \quad Y = \frac{a}{b}
\]

Then one defines new variables according to the formula

\[
x + \frac{c}{d} = X, \quad y + \frac{a}{b} = Y
\]

In terms of these new variables, the differential equations become

\[
\begin{array}{l}
x' = \left(x + \frac{c}{d}\right)\left(a - b\left(y + \frac{a}{b}\right)\right) \\
y' = -\left(y + \frac{a}{b}\right)\left(c - d\left(x + \frac{c}{d}\right)\right)
\end{array}
\]

Multiplying out the right sides yields

\[
\begin{array}{l}
x' = -bxy - b\frac{c}{d}y \\
y' = dxy + \frac{a}{b}dx
\end{array}
\]

The interest is for x, y small and so these equations are essentially equal to

\[
x' = -b\frac{c}{d}y, \quad y' = \frac{a}{b}dx
\]

Replace $x'$ with the difference quotient $\frac{x(t+h) - x(t)}{h}$ where h is a small positive number and $y'$ with a similar difference quotient. For example one could have h correspond to one day or even one hour. Thus, for h small enough, the following would seem to be a good approximation to the differential equations.

\[
\begin{array}{l}
x(t+h) = x(t) - hb\frac{c}{d}y \\
y(t+h) = y(t) + h\frac{a}{b}dx
\end{array}
\]

Let $1, 2, 3, \cdots$ denote the ends of discrete intervals of time having length h chosen above. Then the above equations take the form

\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix}
=
\begin{bmatrix} 1 & -\frac{hbc}{d} \\ \frac{had}{b} & 1 \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]
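To get a feel for the discretized system, the sketch below builds its matrix for some hypothetical positive constants and computes the eigenvalues from the trace and determinant. The constants `a`, `b`, `c`, `d`, `h` are made up purely for illustration, and the function names are assumptions; the trace is 2 and the determinant is $1 + h^2 ac$, so the eigenvalues work out to $1 \pm i h \sqrt{ac}$.

```python
import cmath
import math

def discretized_matrix(a, b, c, d, h):
    """Matrix [[1, -hbc/d], [had/b, 1]] of the linearized system with step h."""
    return [[1.0, -h * b * c / d],
            [h * a * d / b, 1.0]]

def eigenvalues_2x2(M):
    """Eigenvalues of a 2x2 matrix via the quadratic formula on its characteristic polynomial."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)   # complex square root handles a negative discriminant
    return (tr + root) / 2, (tr - root) / 2

# Hypothetical positive constants, chosen only for illustration.
a, b, c, d, h = 1.0, 0.5, 0.8, 0.2, 0.01
M = discretized_matrix(a, b, c, d, h)
lam1, lam2 = eigenvalues_2x2(M)
print(lam1, lam2)
print("expected imaginary part:", h * math.sqrt(a * c))
```

Because $a$ and $c$ are positive, the discriminant $-4h^2ac$ is negative for every step size $h > 0$, so the eigenvalues have a non-zero imaginary part.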
Note that the eigenvalues of this matrix are always complex.

We are not interested in time intervals of length h for h very small. Instead, we are interested in much longer lengths of time. Thus, replacing the time interval with mh,

\[
\begin{bmatrix} x(n+m) \\ y(n+m) \end{bmatrix}
=
\begin{bmatrix} 1 & -\frac{hbc}{d} \\ \frac{had}{b} & 1 \end{bmatrix}^{m}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]

For example, if m = 2, you would have

\[
\begin{bmatrix} x(n+2) \\ y(n+2) \end{bmatrix}
=
\begin{bmatrix} 1 - ach^2 & -2\frac{bc}{d}h \\ 2\frac{ad}{b}h & 1 - ach^2 \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]

Note that most of the time, the eigenvalues of the new matrix will be complex.

You can also notice that the upper right corner will be negative by considering higher powers of the matrix. Thus letting $1, 2, 3, \cdots$ denote the ends of discrete intervals of time, the desired discrete dynamical system is of the form

\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix}
=
\begin{bmatrix} a & -b \\ c & d \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
\]

where a, b, c, d are positive constants and the matrix will likely have complex eigenvalues because it is a power of a matrix which has complex eigenvalues.

You can see from the above discussion that if the eigenvalues of the matrix used to define the dynamical system are less than 1 in absolute value, then the origin is stable in the sense that as $n \to \infty$, the solution converges to the origin. If either eigenvalue is larger than 1 in absolute value, then the solutions to the dynamical system will usually be unbounded, unless the initial condition is chosen very carefully. The next example exhibits the case where one eigenvalue is larger than 1 and the other is smaller than 1.

The following example demonstrates a familiar concept as a dynamical system.

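Solution. This sequence is extremely important in the study of reproducing rabbits. It can be considered as a dynamical system as follows. Let $y(n) = x(n+1)$. Then the above recurrence relation can be written as

\[
\begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix}
=
\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix},
\quad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix}
=
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]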
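The eigenvalues of the matrix A are $\lambda_1 = \frac{1}{2} - \frac{1}{2}\sqrt{5}$ and $\lambda_2 = \frac{1}{2}\sqrt{5} + \frac{1}{2}$. The corresponding eigenvectors are, respectively,

\[
X_1 = \begin{bmatrix} -\frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 \end{bmatrix},
\quad
X_2 = \begin{bmatrix} \frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 \end{bmatrix}
\]

You can see from a short computation that one of the eigenvalues is smaller than 1 in absolute value while the other is larger than 1 in absolute value. Now, diagonalizing A gives us

\[
\begin{bmatrix} \frac{1}{2}\sqrt{5} - \frac{1}{2} & -\frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 & 1 \end{bmatrix}^{-1}
\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} \frac{1}{2}\sqrt{5} - \frac{1}{2} & -\frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 & 1 \end{bmatrix}
=
\begin{bmatrix} \frac{1}{2}\sqrt{5} + \frac{1}{2} & 0 \\ 0 & \frac{1}{2} - \frac{1}{2}\sqrt{5} \end{bmatrix}
\]

Then it follows that for a given initial condition, the solution to this dynamical system is of the form

\[
\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}
=
\begin{bmatrix} \frac{1}{2}\sqrt{5} - \frac{1}{2} & -\frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} \left(\frac{1}{2}\sqrt{5} + \frac{1}{2}\right)^n & 0 \\ 0 & \left(\frac{1}{2} - \frac{1}{2}\sqrt{5}\right)^n \end{bmatrix}
\begin{bmatrix} \frac{1}{5}\sqrt{5} & \frac{1}{10}\sqrt{5} + \frac{1}{2} \\ -\frac{1}{5}\sqrt{5} & \frac{1}{2} - \frac{1}{10}\sqrt{5} \end{bmatrix}
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]

It follows that

\[
x(n) = \left(\frac{1}{2}\sqrt{5} + \frac{1}{2}\right)^n \left(\frac{1}{10}\sqrt{5} + \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{2}\sqrt{5}\right)^n \left(\frac{1}{2} - \frac{1}{10}\sqrt{5}\right)
\]

*

Here is a picture of the ordered pairs $(x(n), y(n))$ for $n = 0, 1, \cdots, n$.

[Figure: plot of the ordered pairs (x(n), y(n)).]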
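The recurrence with matrix $\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}$ and initial condition $x(0) = y(0) = 1$ can be checked in a few lines of Python. In this sketch, the closed form is the first entry of $PD^nP^{-1}V_0$ multiplied out by hand, which should be read as a worked assumption rather than a formula quoted from the text; the function names are illustrative.

```python
import math

def fib_iterate(n):
    """x(n) from V_{k+1} = A V_k with A = [[0, 1], [1, 1]] and V_0 = (1, 1)."""
    x, y = 1, 1
    for _ in range(n):
        x, y = y, x + y
    return x

def fib_closed_form(n):
    """First entry of P D^n P^{-1} V_0, using the two eigenvalues of A."""
    s5 = math.sqrt(5)
    lam1 = (1 - s5) / 2      # smaller than 1 in absolute value
    lam2 = (1 + s5) / 2      # larger than 1 in absolute value
    return (0.5 + s5 / 10) * lam2 ** n + (0.5 - s5 / 10) * lam1 ** n

for n in range(11):
    print(n, fib_iterate(n), round(fib_closed_form(n), 6))
```

For large n the term involving the smaller eigenvalue becomes negligible, so $x(n)$ grows essentially like a constant times $\left(\frac{1}{2}\sqrt{5} + \frac{1}{2}\right)^n$, which illustrates the unbounded behavior described before this example.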


There  is  so  much  more  that  can  be  said  about  dynamical  systems.  It  is  a  major  topic  of  study  in 
differential  equations  and  what  is  given  above  is  just  an  introduction. 
7.3.5. The Matrix Exponential


The goal of this section is to use the concept of the matrix exponential to solve first order linear differential equations. We begin by defining the matrix exponential.

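Suppose A is a diagonalizable matrix. Then the matrix exponential, written $e^A$, can be easily defined. Recall that if D is a diagonal matrix with

\[
P^{-1}AP = D
\]

then D is of the form

\[
D = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}
\tag{7.5}
\]

and it follows that

\[
D^m = \begin{bmatrix} \lambda_1^m & & 0 \\ & \ddots & \\ 0 & & \lambda_n^m \end{bmatrix}
\]

Since A is diagonalizable,

\[
A = PDP^{-1}
\]

and

\[
A^m = PD^mP^{-1}
\]

Recall why this is true.

\[
A = PDP^{-1}
\]

and so

\[
A^m = \overbrace{PDP^{-1}\,PDP^{-1}\,PDP^{-1} \cdots PDP^{-1}}^{m \text{ times}} = PD^mP^{-1}
\]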

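We now will examine what is meant by the matrix exponential $e^A$. Begin by formally writing the following power series for $e^A$:

\[
e^A = \sum_{k=0}^{\infty} \frac{A^k}{k!} = \sum_{k=0}^{\infty} \frac{PD^kP^{-1}}{k!}
\]

If D is given above in 7.5, the above sum is of the form

\[
P \left( \sum_{k=0}^{\infty}
\begin{bmatrix} \frac{1}{k!}\lambda_1^k & & 0 \\ & \ddots & \\ 0 & & \frac{1}{k!}\lambda_n^k \end{bmatrix}
\right) P^{-1}
\]

This can be rearranged as follows:

\[
e^A = P
\begin{bmatrix} \sum_{k=0}^{\infty} \frac{1}{k!}\lambda_1^k & & 0 \\ & \ddots & \\ 0 & & \sum_{k=0}^{\infty} \frac{1}{k!}\lambda_n^k \end{bmatrix}
P^{-1}
= P
\begin{bmatrix} e^{\lambda_1} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_n} \end{bmatrix}
P^{-1}
\]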

This  justifies  the  following  theorem. 


Example 7.46: Compute $e^A$ for a Matrix A

Let

\[
A = \begin{bmatrix}
2 & -1 & -1 \\
1 & 2 & 1 \\
-1 & 1 & 2
\end{bmatrix}
\]

Find $e^A$.

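Solution. The eigenvalues work out to be 1, 2, 3 and eigenvectors associated with these eigenvalues are

\[
\begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix} \leftrightarrow 1, \quad
\begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} \leftrightarrow 2, \quad
\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \leftrightarrow 3
\]

Then let

\[
D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}, \quad
P = \begin{bmatrix} 0 & -1 & -1 \\ -1 & -1 & 0 \\ 1 & 1 & 1 \end{bmatrix}
\]

and so

\[
P^{-1} = \begin{bmatrix} 1 & 0 & 1 \\ -1 & -1 & -1 \\ 0 & 1 & 1 \end{bmatrix}
\]

Then the matrix exponential is

\[
e^A =
\begin{bmatrix} 0 & -1 & -1 \\ -1 & -1 & 0 \\ 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} e^1 & 0 & 0 \\ 0 & e^2 & 0 \\ 0 & 0 & e^3 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 1 \\ -1 & -1 & -1 \\ 0 & 1 & 1 \end{bmatrix}
\]
\[
=
\begin{bmatrix}
e^2 & e^2 - e^3 & e^2 - e^3 \\
e^2 - e & e^2 & e^2 - e \\
-e^2 + e & -e^2 + e^3 & -e^2 + e + e^3
\end{bmatrix}
\]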
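As a numerical cross-check of Example 7.46, the sketch below computes $e^A$ in two ways: from the diagonalization $P e^D P^{-1}$ and from a partial sum of the power series $\sum_k A^k / k!$. The function names and the choice of 40 series terms are assumptions made for illustration.

```python
import math

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[2, -1, -1],
     [1, 2, 1],
     [-1, 1, 2]]
P = [[0, -1, -1],
     [-1, -1, 0],
     [1, 1, 1]]
P_inv = [[1, 0, 1],
         [-1, -1, -1],
         [0, 1, 1]]
eigenvalues = [1, 2, 3]

def exp_via_diagonalization(P, eigenvalues, P_inv):
    """Return P diag(e^{lambda_1}, ..., e^{lambda_n}) P^{-1}."""
    n = len(eigenvalues)
    expD = [[math.exp(eigenvalues[i]) if i == j else 0.0 for j in range(n)]
            for i in range(n)]
    return matmul(matmul(P, expD), P_inv)

def exp_via_series(A, terms=40):
    """Partial sum of the series sum_{k >= 0} A^k / k!."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]  # k = 0 term
    term = [row[:] for row in total]
    for k in range(1, terms):
        # term holds A^k / k!, built from A^(k-1) / (k-1)!
        term = [[v / k for v in row] for row in matmul(term, A)]
        total = [[t + u for t, u in zip(rt, ru)] for rt, ru in zip(total, term)]
    return total

eA = exp_via_diagonalization(P, eigenvalues, P_inv)
eA_series = exp_via_series(A)
for row in eA:
    print([round(v, 6) for v in row])
```

The diagonalized form needs only three scalar exponentials, while the series needs many matrix products; the agreement of the two results is the point of the check.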
The matrix exponential is a useful tool to solve autonomous systems of first order linear differential equations. These are equations which are of the form

\[
X' = AX, \quad X(0) = C
\]

where A is a diagonalizable $n \times n$ matrix and C is a constant vector. X is a vector of functions in one variable, t:

\[
X = X(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{bmatrix}
\]

Then $X'$ refers to the first derivative of X and is given by

\[
X' = X'(t) = \begin{bmatrix} x_1'(t) \\ x_2'(t) \\ \vdots \\ x_n'(t) \end{bmatrix},
\quad x_i'(t) = \text{the derivative of } x_i(t)
\]

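Then it turns out that the solution to the above system of equations is $X(t) = e^{At}C$. To see this, suppose A is diagonalizable so that

\[
A = P \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix} P^{-1}
\]

Then

\[
e^{At} = P \begin{bmatrix} e^{\lambda_1 t} & & \\ & \ddots & \\ & & e^{\lambda_n t} \end{bmatrix} P^{-1}
\]
\[
e^{At}C = P \begin{bmatrix} e^{\lambda_1 t} & & \\ & \ddots & \\ & & e^{\lambda_n t} \end{bmatrix} P^{-1}C
\]

Differentiating $e^{At}C$ yields

\[
X' = \left(e^{At}C\right)' = P \begin{bmatrix} \lambda_1 e^{\lambda_1 t} & & \\ & \ddots & \\ & & \lambda_n e^{\lambda_n t} \end{bmatrix} P^{-1}C
\]
\[
= P \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix}
\begin{bmatrix} e^{\lambda_1 t} & & \\ & \ddots & \\ & & e^{\lambda_n t} \end{bmatrix} P^{-1}C
\]
\[
= P \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix} P^{-1}
P \begin{bmatrix} e^{\lambda_1 t} & & \\ & \ddots & \\ & & e^{\lambda_n t} \end{bmatrix} P^{-1}C
\]
\[
= A\left(e^{At}C\right) = AX
\]

Therefore $X = X(t) = e^{At}C$ is a solution to $X' = AX$.

To prove that $X(0) = C$ if $X(t) = e^{At}C$:

\[
X(0) = e^{A0}C = P \begin{bmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{bmatrix} P^{-1}C = C
\]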

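Solution. The matrix is diagonalizable and can be written as

\[
A = PDP^{-1}
\]
\[
\begin{bmatrix} 0 & -2 \\ 1 & 3 \end{bmatrix}
=
\begin{bmatrix} 1 & 1 \\ -\frac{1}{2} & -1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}
\begin{bmatrix} 2 & 2 \\ -1 & -2 \end{bmatrix}
\]

Therefore, the matrix exponential is of the form

\[
e^{At} =
\begin{bmatrix} 1 & 1 \\ -\frac{1}{2} & -1 \end{bmatrix}
\begin{bmatrix} e^t & 0 \\ 0 & e^{2t} \end{bmatrix}
\begin{bmatrix} 2 & 2 \\ -1 & -2 \end{bmatrix}
\]

The solution to the initial value problem is

\[
X(t) = e^{At}C
\]
\[
\begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
=
\begin{bmatrix} 1 & 1 \\ -\frac{1}{2} & -1 \end{bmatrix}
\begin{bmatrix} e^t & 0 \\ 0 & e^{2t} \end{bmatrix}
\begin{bmatrix} 2 & 2 \\ -1 & -2 \end{bmatrix}
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
=
\begin{bmatrix} 4e^t - 3e^{2t} \\ 3e^{2t} - 2e^t \end{bmatrix}
\]

We can check that this works:

\[
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix}
=
\begin{bmatrix} 4e^0 - 3e^{2(0)} \\ 3e^{2(0)} - 2e^0 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]

Lastly,

\[
X' = \begin{bmatrix} 4e^t - 3e^{2t} \\ 3e^{2t} - 2e^t \end{bmatrix}'
= \begin{bmatrix} 4e^t - 6e^{2t} \\ 6e^{2t} - 2e^t \end{bmatrix}
\]

and

\[
AX = \begin{bmatrix} 0 & -2 \\ 1 & 3 \end{bmatrix}
\begin{bmatrix} 4e^t - 3e^{2t} \\ 3e^{2t} - 2e^t \end{bmatrix}
= \begin{bmatrix} 4e^t - 6e^{2t} \\ 6e^{2t} - 2e^t \end{bmatrix}
\]

which is the same thing. Thus this is the solution to the initial value problem.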
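The check at the end of this solution can also be done numerically. The sketch below evaluates the solution $X(t) = (4e^t - 3e^{2t},\ 3e^{2t} - 2e^t)$, its derivative, and the product $AX(t)$ at a few sample times, and compares the derivative with a central difference quotient. The function names, sample times, and step size are assumptions chosen for illustration.

```python
import math

A = [[0, -2],
     [1, 3]]

def X(t):
    """Solution of X' = AX with X(0) = (1, 1), as found in the example."""
    return (4 * math.exp(t) - 3 * math.exp(2 * t),
            3 * math.exp(2 * t) - 2 * math.exp(t))

def X_prime(t):
    """Entry-wise derivative of X(t), computed by hand."""
    return (4 * math.exp(t) - 6 * math.exp(2 * t),
            6 * math.exp(2 * t) - 2 * math.exp(t))

def A_times(v):
    """Matrix-vector product A v."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in A)

def central_difference(f, t, h=1e-6):
    """Approximate f'(t) entry-wise by (f(t + h) - f(t - h)) / (2h)."""
    return tuple((p - m) / (2 * h) for p, m in zip(f(t + h), f(t - h)))

print("X(0) =", X(0))
for t in (0.0, 0.5, 1.0):
    print(t, X_prime(t), A_times(X(t)), central_difference(X, t))
```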

Exercises 


Exercise 7.3.1 Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$. Diagonalize A to find $A^{10}$.


Exercise 7.3.2 Let $A = \begin{bmatrix} 1 & 4 & 1 \\ 0 & 2 & 5 \\ 0 & 0 & 5 \end{bmatrix}$. Diagonalize A to find $A^{50}$.


Exercise 7.3.3 Let $A = \begin{bmatrix} 1 & -2 & -1 \\ 2 & -1 & 1 \\ -2 & 3 & 1 \end{bmatrix}$. Diagonalize A to find $A^{100}$.


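Exercise 7.3.4 The following is a Markov (migration) matrix for three locations

\[
\begin{bmatrix}
\frac{7}{10} & \frac{1}{9} & \frac{1}{5} \\
\frac{1}{10} & \frac{7}{9} & \frac{2}{5} \\
\frac{1}{5} & \frac{1}{9} & \frac{2}{5}
\end{bmatrix}
\]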


(a)  Initially,  there  are  90  people  in  location  1,  81  in  location  2,  and  85  in  location  3.  How  many  are  in 
each  location  after  one  time  period? 

(b)  The  total  number  of  individuals  in  the  migration  process  is  256.  After  a  long  time,  how  many  are  in 
each  location? 


Exercise 7.3.5 The following is a Markov (migration) matrix for three locations

\[
\begin{bmatrix}
\frac{1}{5} & \frac{1}{5} & \frac{2}{5} \\
\frac{2}{5} & \frac{2}{5} & \frac{1}{5} \\
\frac{2}{5} & \frac{2}{5} & \frac{2}{5}
\end{bmatrix}
\]

(a) Initially, there are 130 individuals in location 1, 300 in location 2, and 70 in location 3. How many are in each location after two time periods?

(b)  The  total  number  of  individuals  in  the  migration  process  is  500.  After  a  long  time,  how  many  are  in 
each  location? 

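Exercise 7.3.6 The following is a Markov (migration) matrix for three locations

\[
\begin{bmatrix}
\frac{3}{10} & \frac{3}{8} & \frac{1}{3} \\
\frac{1}{10} & \frac{3}{8} & \frac{1}{3} \\
\frac{3}{5} & \frac{1}{4} & \frac{1}{3}
\end{bmatrix}
\]

The total number of individuals in the migration process is 480. After a long time, how many are in each location?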

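Exercise 7.3.7 The following is a Markov (migration) matrix for three locations

\[
\begin{bmatrix}
\frac{3}{10} & \frac{1}{3} & \frac{1}{5} \\
\frac{3}{10} & \frac{1}{3} & \frac{7}{10} \\
\frac{2}{5} & \frac{1}{3} & \frac{1}{10}
\end{bmatrix}
\]

The total number of individuals in the migration process is 1155. After a long time, how many are in each location?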

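Exercise 7.3.8 The following is a Markov (migration) matrix for three locations

\[
\begin{bmatrix}
\frac{2}{5} & \frac{1}{10} & \frac{1}{8} \\
\frac{3}{10} & \frac{2}{5} & \frac{5}{8} \\
\frac{3}{10} & \frac{1}{2} & \frac{1}{4}
\end{bmatrix}
\]

The total number of individuals in the migration process is 704. After a long time, how many are in each location?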

Exercise 7.3.9 A person sets off on a random walk with three possible locations. The Markov matrix of probabilities $A = [a_{ij}]$ is given by

\[
\begin{bmatrix}
0.1 & 0.3 & 0.7 \\
0.1 & 0.3 & 0.2 \\
0.8 & 0.4 & 0.1
\end{bmatrix}
\]

If the walker starts in location 2, what is the probability of ending back in location 2 at time $n = 3$?
Exercise 7.3.10 A person sets off on a random walk with three possible locations. The Markov matrix of probabilities $A = [a_{ij}]$ is given by

\[
\begin{bmatrix}
0.5 & 0.1 & 0.6 \\
0.2 & 0.9 & 0.2 \\
0.3 & 0 & 0.2
\end{bmatrix}
\]

It is unknown where the walker starts, but the probability of starting in each location is given by

\[
X_0 = \begin{bmatrix} 0.2 \\ 0.25 \\ 0.55 \end{bmatrix}
\]

What is the probability of the walker being in location 1 at time $n = 2$?


Exercise 7.3.11 You own a trailer rental company in a large city and you have four locations, one in
the South East, one in the North East, one in the North West, and one in the South West. Denote these
locations by SE, NE, NW, and SW respectively. Suppose that the following table is observed to take place.

\[
\begin{array}{c|cccc}
 & \text{SE} & \text{NE} & \text{NW} & \text{SW} \\ \hline
\text{SE} & \frac{1}{3} & \frac{1}{10} & \frac{1}{10} & \frac{1}{5} \\
\text{NE} & \frac{1}{3} & \frac{7}{10} & \frac{1}{5} & \frac{1}{10} \\
\text{NW} & \frac{2}{9} & \frac{1}{10} & \frac{3}{5} & \frac{1}{5} \\
\text{SW} & \frac{1}{9} & \frac{1}{10} & \frac{1}{10} & \frac{1}{2}
\end{array}
\]

In this table, the probability that a trailer starting at NE ends in NW is 1/10, the probability that a trailer
starting at SW ends in NW is 1/5, and so forth. Approximately how many will you have in each location
after a long time if the total number of trailers is 413?


Exercise 7.3.12 You own a trailer rental company in a large city and you have four locations, one in
the South East, one in the North East, one in the North West, and one in the South West. Denote these
locations by SE, NE, NW, and SW respectively. Suppose that the following table is observed to take place.

\[
\begin{array}{c|cccc}
 & \text{SE} & \text{NE} & \text{NW} & \text{SW} \\ \hline
\text{SE} & \frac{1}{7} & \frac{1}{4} & \frac{1}{10} & \frac{1}{5} \\
\text{NE} & \frac{2}{7} & \frac{1}{4} & \frac{1}{5} & \frac{1}{10} \\
\text{NW} & \frac{1}{7} & \frac{1}{4} & \frac{3}{5} & \frac{1}{5} \\
\text{SW} & \frac{3}{7} & \frac{1}{4} & \frac{1}{10} & \frac{1}{2}
\end{array}
\]

In this table, the probability that a trailer starting at NE ends in NW is 1/10, the probability that a trailer
starting at SW ends in NW is 1/5, and so forth. Approximately how many will you have in each location
after a long time if the total number of trailers is 1469?

Exercise  7.3.13  The  following  table  describes  the  transition  probabilities  between  the  states  rainy,  partly 
cloudy  and  sunny.  The  symbol  p.c.  indicates  partly  cloudy.  Thus  if  it  starts  off  p.c.  it  ends  up  sunny  the 
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1  9 

next  day  with  probability  If  it  starts  off  sunny,  it  ends  up  sunny  the  next  day  with  probability  and  so 
forth. 


rams 

l 

5 

sunny 

l 

5 

rains 

sunny 

1 

5 

2 

5 

3 

2 

p.c. 

5 

5 

Given  this  information,  what  are  the  probabilities  that  a  given  day  is  rainy,  sunny,  or  partly  cloudy? 


Exercise  7.3.14  The  following  table  describes  the  transition  probabilities  between  the  states  rainy,  partly 
cloudy  and  sunny.  The  symbol  p.c.  indicates  partly  cloudy.  Thus  if  it  starts  off  p.c.  it  ends  up  sunny  the 
next  day  with  probability  jq.  If  it  starts  off  sunny,  it  ends  up  sunny  the  next  day  with  probability  |  and  so 
forth. 


rams 

l 

5 

sunny 

l 

5 

rcdns 

sunny 

1 

10 

2 

5 

7 

2 

p.c. 

10 

5 

Given  this  information,  what  are  the  probabilities  that  a  given  day  is  rainy,  sunny,  or  partly  cloudy? 


Exercise 7.3.15 You own a trailer rental company in a large city and you have four locations, one in
the South East, one in the North East, one in the North West, and one in the South West. Denote these
locations by SE, NE, NW, and SW respectively. Suppose that the following table is observed to take place.

\[
\begin{array}{c|cccc}
 & \text{SE} & \text{NE} & \text{NW} & \text{SW} \\ \hline
\text{SE} & \frac{5}{11} & \frac{1}{10} & \frac{1}{10} & \frac{1}{5} \\
\text{NE} & \frac{1}{11} & \frac{7}{10} & \frac{1}{5} & \frac{1}{10} \\
\text{NW} & \frac{2}{11} & \frac{1}{10} & \frac{3}{5} & \frac{1}{5} \\
\text{SW} & \frac{3}{11} & \frac{1}{10} & \frac{1}{10} & \frac{1}{2}
\end{array}
\]

In this table, the probability that a trailer starting at NE ends in NW is 1/10, the probability that a trailer
starting at SW ends in NW is 1/5, and so forth. Approximately how many will you have in each location
after a long time if the total number of trailers is 407?


Exercise 7.3.16 The University of Poohbah offers three degree programs, scouting education (SE), dance
appreciation (DA), and engineering (E). It has been determined that the probabilities of transferring from
one program to another are as in the following table.

\[
\begin{array}{c|ccc}
 & \text{SE} & \text{DA} & \text{E} \\ \hline
\text{SE} & .8 & .1 & .3 \\
\text{DA} & .1 & .7 & .5 \\
\text{E} & .1 & .2 & .2
\end{array}
\]

where the number indicates the probability of transferring from the top program to the program on the
left. Thus the probability of going from DA to E is .2. Find the probability that a student is enrolled in the
various programs.


Exercise 7.3.17 In the city of Nabal, there are three political persuasions, republicans (R), democrats (D),
and neither one (N). The following table shows the transition probabilities between the political parties,
the top row being the initial political party and the side row being the political affiliation the following
year.

\[
\begin{array}{c|ccc}
 & \text{R} & \text{D} & \text{N} \\ \hline
\text{R} & \frac{1}{5} & \frac{1}{6} & \frac{2}{7} \\
\text{D} & \frac{1}{5} & \frac{1}{3} & \frac{4}{7} \\
\text{N} & \frac{3}{5} & \frac{1}{2} & \frac{1}{7}
\end{array}
\]

Find the probabilities that a person will be identified with the various political persuasions. Which party
will end up being most important?


Exercise  7.3.18  The  following  table  describes  the  transition  probabilities  between  the  states  rainy,  partly 
cloudy  and  sunny.  The  symbol  p.c.  indicates  partly  cloudy.  Thus  if  it  starts  off  p.c.  it  ends  up  sunny  the 
next  day  with  probability  i.  If  it  starts  off  sunny,  it  ends  up  sunny  the  next  day  with  probability  if  and  so 
forth. 


rams 

l 

5 

sunny 

mins 

2 

7 

sunny 

1 

5 

2 

7 

p.c. 

3 

3 

5 

7 

Given  this  information,  what  are  the  probabilities  that  a  given  day  is  rainy,  sunny,  or  partly  cloudy? 


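Exercise 7.3.19 Find the solution to the initial value problem

\[
\begin{bmatrix} x \\ y \end{bmatrix}' =
\begin{bmatrix} 0 & -1 \\ 6 & 5 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix},
\qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} =
\begin{bmatrix} 2 \\ 2 \end{bmatrix}
\]

Hint: form the matrix exponential $e^{At}$ and then the solution is $e^{At}C$ where $C$ is the initial vector.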

Exercise 7.3.20 Find the solution to the initial value problem

\[
\begin{bmatrix} x \\ y \end{bmatrix}' =
\begin{bmatrix} -4 & -3 \\ 6 & 5 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix},
\qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} =
\begin{bmatrix} 3 \\ 4 \end{bmatrix}
\]

Hint: form the matrix exponential $e^{At}$ and then the solution is $e^{At}C$ where $C$ is the initial vector.

Exercise 7.3.21 Find the solution to the initial value problem

\[
\begin{bmatrix} x \\ y \end{bmatrix}' =
\begin{bmatrix} -1 & 2 \\ -4 & 5 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix},
\qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} =
\begin{bmatrix} 2 \\ 2 \end{bmatrix}
\]

Hint: form the matrix exponential $e^{At}$ and then the solution is $e^{At}C$ where $C$ is the initial vector.


7.4.1. Orthogonal Diagonalization


We  begin  this  section  by  recalling  some  important  definitions.  Recall  from  Definition  4.122  that  non-zero 
vectors  are  called  orthogonal  if  their  dot  product  equals  0.  A  set  is  orthonormal  if  it  is  orthogonal  and  each 
vector  is  a  unit  vector. 

An orthogonal matrix $U$, from Definition 4.129, is one in which $UU^T = I$. In other words, the transpose
of  an  orthogonal  matrix  is  equal  to  its  inverse.  A  key  characteristic  of  orthogonal  matrices,  which  will  be 
essential  in  this  section,  is  that  the  columns  of  an  orthogonal  matrix  form  an  orthonormal  set. 

We  now  recall  another  important  definition. 


Definition  7.48:  Symmetric  and  Skew  Symmetric  Matrices 


A real $n \times n$ matrix $A$ is symmetric if $A^T = A$. If $A = -A^T$, then $A$ is called skew symmetric.


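Before proving an essential theorem, we first examine the following lemma which will be used below.


Lemma 7.49: The Dot Product


Let $A = [a_{ij}]$ be a real symmetric $n \times n$ matrix, and let $\vec{x}, \vec{y} \in \mathbb{R}^n$. Then

\[
A\vec{x} \cdot \vec{y} = \vec{x} \cdot A\vec{y}
\]


Proof. This result follows from the definition of the dot product together with properties of matrix
multiplication, as follows:

\[
\begin{aligned}
A\vec{x} \cdot \vec{y} &= \sum_{k,l} a_{kl} x_l y_k \\
&= \sum_{k,l} (A^T)_{lk} x_l y_k \\
&= \vec{x} \cdot A^T\vec{y} \\
&= \vec{x} \cdot A\vec{y}
\end{aligned}
\]

The last step follows from $A^T = A$, since $A$ is symmetric. ♠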
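
As a quick numerical illustration of the lemma, the following Python sketch checks that $A\vec{x} \cdot \vec{y} = \vec{x} \cdot A\vec{y}$ for one symmetric matrix, and that the identity can fail without symmetry. The matrices `A` and `B`, the vectors, and the helper names are arbitrary choices made for this illustration and are not taken from the text.

```python
def mat_vec(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

A = [[1, 2], [2, 3]]   # symmetric: A equals its transpose
B = [[1, 2], [0, 3]]   # not symmetric
x, y = [1, -4], [5, 7]

sym_lhs, sym_rhs = dot(mat_vec(A, x), y), dot(x, mat_vec(A, y))
non_lhs, non_rhs = dot(mat_vec(B, x), y), dot(x, mat_vec(B, y))

print(sym_lhs == sym_rhs)  # the two sides agree for the symmetric matrix
print(non_lhs == non_rhs)  # they differ for this non-symmetric matrix
```

A single numerical check is of course not a proof; it only shows the identity at work on one concrete instance.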

We  can  now  prove  that  the  eigenvalues  of  a  real  symmetric  matrix  are  real  numbers.  Consider  the 
following  important  theorem. 


Theorem  7.50:  Orthogonal  Eigenvectors 


Let  A  be  a  real  symmetric  matrix.  Then  the  eigenvalues  of  A  are  real  numbers  and  eigenvectors 
corresponding  to  distinct  eigenvalues  are  orthogonal. 


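Proof. Recall that for a complex number $a + ib$, the complex conjugate, denoted by $\overline{a + ib}$, is given by
$\overline{a + ib} = a - ib$. The notation $\overline{\vec{x}}$ will denote the vector which has every entry replaced by its complex
conjugate.

Suppose $A$ is a real symmetric matrix and $A\vec{x} = \lambda\vec{x}$. Then

\[
\overline{\lambda}\,\overline{\vec{x}}^T \vec{x} = \left(\overline{A\vec{x}}\right)^T \vec{x} = \overline{\vec{x}}^T A^T \vec{x} = \overline{\vec{x}}^T A \vec{x} = \lambda\,\overline{\vec{x}}^T \vec{x}
\]

Dividing by $\overline{\vec{x}}^T \vec{x}$ on both sides yields $\overline{\lambda} = \lambda$, which says $\lambda$ is real. To do this, we need to ensure that
$\overline{\vec{x}}^T \vec{x} \neq 0$. Notice that $\overline{\vec{x}}^T \vec{x} = 0$ if and only if $\vec{x} = \vec{0}$. Since we chose $\vec{x}$ such that $A\vec{x} = \lambda\vec{x}$, $\vec{x}$ is an eigenvector
and therefore must be nonzero.

Now suppose $A$ is real symmetric and $A\vec{x} = \lambda\vec{x}$, $A\vec{y} = \mu\vec{y}$ where $\mu \neq \lambda$. Then since $A$ is symmetric, it
follows from Lemma 7.49 about the dot product that

\[
\lambda\,\vec{x} \cdot \vec{y} = A\vec{x} \cdot \vec{y} = \vec{x} \cdot A\vec{y} = \vec{x} \cdot \mu\vec{y} = \mu\,\vec{x} \cdot \vec{y}
\]

Hence $(\lambda - \mu)\,\vec{x} \cdot \vec{y} = 0$. It follows that, since $\lambda - \mu \neq 0$, it must be that $\vec{x} \cdot \vec{y} = 0$. Therefore the eigenvectors
form an orthogonal set. ♠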

The  following  theorem  is  proved  in  a  similar  manner. 


Theorem  7.51:  Eigenvalues  of  Skew  Symmetric  Matrix 


The  eigenvalues  of  a  real  skew  symmetric  matrix  are  either  equal  to  0  or  are  pure  imaginary  num¬ 
bers. 


Proof. First, note that if $A = 0$ is the zero matrix, then $A$ is skew symmetric and has eigenvalues equal to
0.

Suppose $A = -A^T$ so $A$ is skew symmetric and $A\vec{x} = \lambda\vec{x}$. Then

\[
\overline{\lambda}\,\overline{\vec{x}}^T \vec{x} = \left(\overline{A\vec{x}}\right)^T \vec{x} = \overline{\vec{x}}^T A^T \vec{x} = -\overline{\vec{x}}^T A \vec{x} = -\lambda\,\overline{\vec{x}}^T \vec{x}
\]

and so, dividing by $\overline{\vec{x}}^T \vec{x}$ as before, $\overline{\lambda} = -\lambda$. Letting $\lambda = a + ib$, this means $a - ib = -a - ib$ and so $a = 0$.
Thus $\lambda$ is pure imaginary. ♠

Consider the following example.


Example  7.52:  Eigenvalues  of  a  Skew  Symmetric  Matrix 

Let $A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$. Find its eigenvalues.

Solution. First notice that $A$ is skew symmetric. By Theorem 7.51, the eigenvalues will either equal 0 or
be pure imaginary. The eigenvalues of $A$ are obtained by solving the usual equation

\[
\det(xI - A) = \det \begin{bmatrix} x & 1 \\ -1 & x \end{bmatrix} = x^2 + 1 = 0
\]

Hence the eigenvalues are $\pm i$, pure imaginary. ♠
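
The computation in this example can be reproduced with a short Python sketch. The helper `eigenvalues_2x2` is a hypothetical name introduced here; it simply applies the quadratic formula to the characteristic equation $x^2 - \mathrm{trace}(A)\,x + \det(A) = 0$ of a $2 \times 2$ matrix, using complex arithmetic so that non-real roots are handled.

```python
import cmath

def eigenvalues_2x2(M):
    """Roots of x^2 - trace(M) x + det(M) = 0 for a 2x2 matrix M."""
    (a, b), (c, d) = M
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

A = [[0, -1], [1, 0]]            # skew symmetric: A = -A^T
lam1, lam2 = eigenvalues_2x2(A)
print(lam1, lam2)                # both have zero real part
```

For this matrix the trace is 0 and the determinant is 1, so the roots have no real part, in agreement with Theorem 7.51.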

Consider  the  following  example. 


Example  7.53:  Eigenvalues  of  a  Symmetric  Matrix 

Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix}$. Find its eigenvalues.

Solution. First, notice that $A$ is symmetric. By Theorem 7.50, the eigenvalues will all be real. The
eigenvalues of $A$ are obtained by solving the usual equation

\[
\det(xI - A) = \det \begin{bmatrix} x - 1 & -2 \\ -2 & x - 3 \end{bmatrix} = x^2 - 4x - 1 = 0
\]

The eigenvalues are given by $\lambda_1 = 2 + \sqrt{5}$ and $\lambda_2 = 2 - \sqrt{5}$, which are both real. ♠
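
For a $2 \times 2$ symmetric matrix with rows $(a, b)$ and $(b, d)$, the discriminant of the characteristic equation is $(a - d)^2 + 4b^2$, which is never negative, so the eigenvalues are real. The Python sketch below illustrates this for the matrix of this example and also checks that the two eigenvectors are orthogonal, as Theorem 7.50 predicts. The function name `symmetric_eig_2x2` and the choice of eigenvector $(b, \lambda - a)$ are assumptions made for this illustration (the latter requires $b \neq 0$).

```python
import math

def symmetric_eig_2x2(a, b, d):
    """Eigenvalues and unnormalized eigenvectors of [[a, b], [b, d]], assuming b != 0."""
    disc = (a - d) ** 2 + 4 * b * b          # never negative, so the roots are real
    lam1 = (a + d + math.sqrt(disc)) / 2
    lam2 = (a + d - math.sqrt(disc)) / 2
    # (b, lam - a) solves (A - lam I) v = 0 when b != 0
    return (lam1, (b, lam1 - a)), (lam2, (b, lam2 - a))

(lam1, v1), (lam2, v2) = symmetric_eig_2x2(1, 2, 3)
overlap = v1[0] * v2[0] + v1[1] * v2[1]      # dot product of the two eigenvectors
print(lam1, lam2, overlap)
```

Up to floating point rounding, the eigenvalues agree with $2 \pm \sqrt{5}$ and the dot product of the eigenvectors is zero.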

Recall that a diagonal matrix $D = [d_{ij}]$ is one in which $d_{ij} = 0$ whenever $i \neq j$. In other words, all
numbers not on the main diagonal are equal to zero.

Consider  the  following  important  theorem. 


Theorem  7.54:  Orthogonal  Diagonalization 


Let  A  be  a  real  symmetric  matrix.  Then  there  exists  an  orthogonal  matrix  U  such  that 

\[
U^TAU = D
\]

where  D  is  a  diagonal  matrix.  Moreover,  the  diagonal  entries  of  D  are  the  eigenvalues  of  A. 


We can use this theorem to diagonalize a symmetric matrix, using orthogonal matrices. Consider the
following corollary.


Corollary 7.55: Orthonormal Set of Eigenvectors


If $A$ is a real $n \times n$ symmetric matrix, then there exists an orthonormal set of eigenvectors,
$\{\vec{u}_1, \cdots, \vec{u}_n\}$.


Proof. Since $A$ is symmetric, then by Theorem 7.54, there exists an orthogonal matrix $U$ such that $U^TAU = D$,
a diagonal matrix whose diagonal entries are the eigenvalues of $A$. Therefore, since $A$ is symmetric and
all the matrices are real,

\[
\overline{D} = \overline{D^T} = \overline{U^TA^TU} = U^TA^TU = U^TAU = D
\]

showing $D$ is real because each entry of $D$ equals its complex conjugate.

Now let

\[
U = \begin{bmatrix} \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_n \end{bmatrix}
\]

where the $\vec{u}_i$ denote the columns of $U$ and

\[
D = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}
\]

The equation $U^TAU = D$ implies $AU = UD$ and

\[
\begin{aligned}
AU &= \begin{bmatrix} A\vec{u}_1 & A\vec{u}_2 & \cdots & A\vec{u}_n \end{bmatrix} \\
&= \begin{bmatrix} \lambda_1\vec{u}_1 & \lambda_2\vec{u}_2 & \cdots & \lambda_n\vec{u}_n \end{bmatrix} \\
&= UD
\end{aligned}
\]

where the entries denote the columns of $AU$ and $UD$ respectively. Therefore, $A\vec{u}_i = \lambda_i\vec{u}_i$. Since the matrix
$U$ is orthogonal, the $ij$th entry of $U^TU$ equals $\delta_{ij}$ and so

\[
\delta_{ij} = \vec{u}_i^T \vec{u}_j = \vec{u}_i \cdot \vec{u}_j
\]

This proves the corollary because it shows the vectors $\{\vec{u}_i\}$ form an orthonormal set.


Definition  7.56:  Principal  Axes 


Let $A$ be an $n \times n$ matrix. Then the principal axes of $A$ is a set of orthonormal eigenvectors of $A$.


In  the  next  example,  we  examine  how  to  find  such  a  set  of  orthonormal  eigenvectors. 


Example  7.57:  Find  an  Orthonormal  Set  of  Eigenvectors 

Find an orthonormal set of eigenvectors for the symmetric matrix

\[
A = \begin{bmatrix} 17 & -2 & -2 \\ -2 & 6 & 4 \\ -2 & 4 & 6 \end{bmatrix}
\]

Solution. Recall Procedure 7.5 for finding the eigenvalues and eigenvectors of a matrix. You can verify
that the eigenvalues are 18, 9, 2. First find the eigenvector for 18 by solving the equation $(18I - A)X = 0$.
The appropriate augmented matrix is given by

\[
\left[\begin{array}{ccc|c}
18 - 17 & 2 & 2 & 0 \\
2 & 18 - 6 & -4 & 0 \\
2 & -4 & 18 - 6 & 0
\end{array}\right]
\]

The reduced row-echelon form is

\[
\left[\begin{array}{rrr|r}
1 & 0 & 4 & 0 \\
0 & 1 & -1 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

Therefore an eigenvector is

\[
\begin{bmatrix} -4 \\ 1 \\ 1 \end{bmatrix}
\]

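Next find the eigenvector for $\lambda = 9$. The augmented matrix and resulting reduced row-echelon form are

\[
\left[\begin{array}{ccc|c}
9 - 17 & 2 & 2 & 0 \\
2 & 9 - 6 & -4 & 0 \\
2 & -4 & 9 - 6 & 0
\end{array}\right]
\rightarrow \cdots \rightarrow
\left[\begin{array}{rrr|r}
1 & 0 & -\frac{1}{2} & 0 \\
0 & 1 & -1 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

Thus an eigenvector for $\lambda = 9$ is

\[
\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}
\]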

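Finally find an eigenvector for $\lambda = 2$. The appropriate augmented matrix and reduced row-echelon form are

\[
\left[\begin{array}{ccc|c}
2 - 17 & 2 & 2 & 0 \\
2 & 2 - 6 & -4 & 0 \\
2 & -4 & 2 - 6 & 0
\end{array}\right]
\rightarrow \cdots \rightarrow
\left[\begin{array}{rrr|r}
1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

Thus an eigenvector for $\lambda = 2$ is

\[
\begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}
\]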

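The set of eigenvectors for $A$ is given by

\[
\left\{
\begin{bmatrix} -4 \\ 1 \\ 1 \end{bmatrix},
\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix},
\begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}
\right\}
\]

You can verify that these eigenvectors form an orthogonal set. By dividing each eigenvector by its
magnitude, we obtain an orthonormal set:

\[
\left\{
\frac{1}{\sqrt{18}} \begin{bmatrix} -4 \\ 1 \\ 1 \end{bmatrix},
\frac{1}{3} \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix},
\frac{1}{\sqrt{2}} \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}
\right\}
\]

♠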
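
The verification left to the reader in Example 7.57 can be sketched in a few lines of Python using exact integer arithmetic. The helper names `mat_vec` and `dot` are introduced here for illustration only; the matrix, eigenvalues, and eigenvectors are those of the example.

```python
def mat_vec(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

A = [[17, -2, -2], [-2, 6, 4], [-2, 4, 6]]
pairs = {18: [-4, 1, 1], 9: [1, 2, 2], 2: [0, -1, 1]}   # eigenvalue -> eigenvector

# Check A v = lambda v for each pair
is_eigen = all(mat_vec(A, v) == [lam * x for x in v] for lam, v in pairs.items())

# Check pairwise orthogonality
vs = list(pairs.values())
is_orthogonal = all(dot(vs[i], vs[j]) == 0 for i in range(3) for j in range(i + 1, 3))

# Squared lengths; normalizing divides each vector by the square root of its entry here
squared_lengths = [dot(v, v) for v in vs]
print(is_eigen, is_orthogonal, squared_lengths)
```

The squared lengths explain the scaling factors $\frac{1}{\sqrt{18}}$, $\frac{1}{3}$, and $\frac{1}{\sqrt{2}}$ used to normalize the eigenvectors.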


Consider  the  following  example. 


Example 7.58: Repeated Eigenvalues

Find an orthonormal set of three eigenvectors for the matrix

\[
A = \begin{bmatrix} 10 & 2 & 2 \\ 2 & 13 & 4 \\ 2 & 4 & 13 \end{bmatrix}
\]

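Solution. You can verify that the eigenvalues of $A$ are 9 (with multiplicity two) and 18 (with multiplicity
one). Consider the eigenvectors corresponding to $\lambda = 9$. The appropriate augmented matrix and reduced
row-echelon form are given by

\[
\left[\begin{array}{ccc|c}
9 - 10 & -2 & -2 & 0 \\
-2 & 9 - 13 & -4 & 0 \\
-2 & -4 & 9 - 13 & 0
\end{array}\right]
\rightarrow \cdots \rightarrow
\left[\begin{array}{rrr|r}
1 & 2 & 2 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

and so eigenvectors are of the form

\[
\begin{bmatrix} -2y - 2z \\ y \\ z \end{bmatrix}
\]

We need to find two of these which are orthogonal. Let one be given by setting $z = 0$ and $y = 1$, giving

\[
\begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}
\]

In order to find an eigenvector orthogonal to this one, we need to satisfy

\[
\begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} \cdot \begin{bmatrix} -2y - 2z \\ y \\ z \end{bmatrix} = 5y + 4z = 0
\]

The values $y = -4$ and $z = 5$ satisfy this equation, giving another eigenvector corresponding to $\lambda = 9$ as

\[
\begin{bmatrix} -2(-4) - 2(5) \\ (-4) \\ 5 \end{bmatrix} = \begin{bmatrix} -2 \\ -4 \\ 5 \end{bmatrix}
\]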

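Next find the eigenvector for $\lambda = 18$. The augmented matrix and the resulting reduced row-echelon
form are given by

\[
\left[\begin{array}{ccc|c}
18 - 10 & -2 & -2 & 0 \\
-2 & 18 - 13 & -4 & 0 \\
-2 & -4 & 18 - 13 & 0
\end{array}\right]
\rightarrow \cdots \rightarrow
\left[\begin{array}{rrr|r}
1 & 0 & -\frac{1}{2} & 0 \\
0 & 1 & -1 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

and so an eigenvector is

\[
\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}
\]

Dividing each eigenvector by its length, the orthonormal set is

\[
\left\{
\frac{1}{\sqrt{5}} \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix},
\frac{1}{3\sqrt{5}} \begin{bmatrix} -2 \\ -4 \\ 5 \end{bmatrix},
\frac{1}{3} \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}
\right\}
\]

♠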

In the above solution, the repeated eigenvalue implies that there would have been many other orthonormal
bases which could have been obtained. While we chose to take $z = 0$, $y = 1$, we could just as easily
have taken $y = 0$ or even $y = z = 1$. Any such change would have resulted in a different orthonormal set.
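
One systematic way to produce orthogonal eigenvectors inside a repeated eigenspace is a single Gram-Schmidt step. The Python sketch below starts from the two basic eigenvectors for $\lambda = 9$ obtained from $(y, z) = (1, 0)$ and $(y, z) = (0, 1)$, and removes from the second its component along the first. This is an alternative to solving $5y + 4z = 0$ by inspection, not the method used in the example; the helper names are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def remove_component(v, u):
    """Subtract from v its projection onto u (one Gram-Schmidt step)."""
    c = Fraction(dot(v, u), dot(u, u))
    return [a - c * b for a, b in zip(v, u)]

# Two basic eigenvectors for the eigenvalue 9 of Example 7.58
w1 = [-2, 1, 0]      # y = 1, z = 0
w2 = [-2, 0, 1]      # y = 0, z = 1

w2_perp = remove_component(w2, w1)
scaled = [5 * a for a in w2_perp]   # clear denominators
print(w2_perp, scaled)
```

After clearing denominators, the result is the same second eigenvector found in the example, which shows that the choice made there is the one Gram-Schmidt would produce from this starting basis.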

Recall  the  following  definition. 


Definition  7.59:  Diagonalizable 


An $n \times n$ matrix $A$ is said to be non defective or diagonalizable if there exists an invertible matrix
$P$ such that $P^{-1}AP = D$ where $D$ is a diagonal matrix.


As indicated in Theorem 7.54, if $A$ is a real symmetric matrix, there exists an orthogonal matrix $U$
such that $U^TAU = D$ where $D$ is a diagonal matrix. Therefore, every symmetric matrix is diagonalizable
because if $U$ is an orthogonal matrix, it is invertible and its inverse is $U^T$. In this case, we say that $A$ is
orthogonally diagonalizable. Therefore every symmetric matrix is in fact orthogonally diagonalizable.
The next theorem provides another way to determine if a matrix is orthogonally diagonalizable.


Theorem  7.60:  Orthogonally  Diagonalizable 


Let  A  be  an  n  x  n  matrix.  Then  A  is  orthogonally  diagonalizable  if  and  only  if  A  has  an  orthonormal 
set  of  eigenvectors. 


Recall  from  Corollary  7.55  that  every  symmetric  matrix  has  an  orthonormal  set  of  eigenvectors.  In  fact 
these  three  conditions  are  equivalent. 

In  the  following  example,  the  orthogonal  matrix  U  will  be  found  to  orthogonally  diagonalize  a  matrix. 


Example 7.61: Diagonalize a Symmetric Matrix

Let $A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{3}{2} & \frac{1}{2} \\ 0 & \frac{1}{2} & \frac{3}{2} \end{bmatrix}$. Find an orthogonal matrix $U$ such that $U^TAU$ is a diagonal matrix.


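Solution. In this case, the eigenvalues are 2 (with multiplicity one) and 1 (with multiplicity two). First
we will find an eigenvector for the eigenvalue 2. The appropriate augmented matrix and resulting reduced
row-echelon form are given by

\[
\left[\begin{array}{ccc|c}
2 - 1 & 0 & 0 & 0 \\
0 & 2 - \frac{3}{2} & -\frac{1}{2} & 0 \\
0 & -\frac{1}{2} & 2 - \frac{3}{2} & 0
\end{array}\right]
\rightarrow \cdots \rightarrow
\left[\begin{array}{rrr|r}
1 & 0 & 0 & 0 \\
0 & 1 & -1 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

and so an eigenvector is

\[
\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}
\]

However, it is desired that the eigenvectors be unit vectors and so dividing this vector by its length gives

\[
\begin{bmatrix} 0 \\ \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix}
\]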

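Next find the eigenvectors corresponding to the eigenvalue equal to 1. The appropriate augmented matrix
and resulting reduced row-echelon form are given by:

\[
\left[\begin{array}{ccc|c}
1 - 1 & 0 & 0 & 0 \\
0 & 1 - \frac{3}{2} & -\frac{1}{2} & 0 \\
0 & -\frac{1}{2} & 1 - \frac{3}{2} & 0
\end{array}\right]
\rightarrow \cdots \rightarrow
\left[\begin{array}{rrr|r}
0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

Therefore, the eigenvectors are of the form

\[
\begin{bmatrix} s \\ -t \\ t \end{bmatrix}
\]

Two of these which are orthonormal are $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, choosing $s = 1$ and $t = 0$, and $\begin{bmatrix} 0 \\ -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix}$, letting $s = 0$,
$t = 1$ and normalizing the resulting vector.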

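To obtain the desired orthogonal matrix, we let the orthonormal eigenvectors computed above be the
columns.

\[
U = \begin{bmatrix}
0 & 1 & 0 \\
-\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}}
\end{bmatrix}
\]

To verify, compute $U^TAU$ as follows:

\[
U^TAU =
\begin{bmatrix}
0 & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\
1 & 0 & 0 \\
0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
0 & \frac{3}{2} & \frac{1}{2} \\
0 & \frac{1}{2} & \frac{3}{2}
\end{bmatrix}
\begin{bmatrix}
0 & 1 & 0 \\
-\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}}
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 2
\end{bmatrix}
= D
\]

the desired diagonal matrix. Notice that the eigenvectors, which construct the columns of $U$, are in the
same order as the eigenvalues in $D$. ♠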
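
The verification in Example 7.61 can also be carried out numerically. The following Python sketch multiplies out $U^TAU$ and $U^TU$ in floating point; the helper names `transpose` and `mat_mul` are introduced here for illustration, and the results should be read as equal to the exact matrices only up to rounding error.

```python
import math

def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def mat_mul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

r = 1 / math.sqrt(2)
A = [[1, 0, 0], [0, 1.5, 0.5], [0, 0.5, 1.5]]
U = [[0, 1, 0], [-r, 0, r], [r, 0, r]]       # columns are the orthonormal eigenvectors

D = mat_mul(mat_mul(transpose(U), A), U)     # should be close to diag(1, 1, 2)
UtU = mat_mul(transpose(U), U)               # should be close to the identity
for row in D:
    print([round(entry, 12) for entry in row])
```

Reordering the columns of `U` would reorder the diagonal entries of `D` in the same way, which is the point of the final remark in the example.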


We  conclude  this  section  with  a  Theorem  that  generalizes  earlier  results. 


Theorem  7.62:  Triangulation  of  a  Matrix 


Let $A$ be an $n \times n$ matrix. If $A$ has $n$ real eigenvalues, then an orthogonal matrix $U$ can be found to
result in the upper triangular matrix $U^TAU$.


This theorem provides a useful corollary.


Corollary 7.63: Determinant and Trace


Let $A$ be an $n \times n$ matrix with eigenvalues $\lambda_1, \cdots, \lambda_n$. Then it follows that $\det(A)$ is equal to the
product of the $\lambda_i$, while $\mathrm{trace}(A)$ is equal to the sum of the $\lambda_i$.


Proof. By Theorem 7.62, there exists an orthogonal matrix $U$ such that $U^TAU = P$, where $P$ is an upper
triangular matrix. Since $P$ is similar to $A$, the eigenvalues of $P$ are $\lambda_1, \lambda_2, \ldots, \lambda_n$. Furthermore, since $P$ is
(upper) triangular, the entries on the main diagonal of $P$ are its eigenvalues, so $\det(P) = \lambda_1 \lambda_2 \cdots \lambda_n$ and
$\mathrm{trace}(P) = \lambda_1 + \lambda_2 + \cdots + \lambda_n$. Since $P$ and $A$ are similar, $\det(A) = \det(P)$ and $\mathrm{trace}(A) = \mathrm{trace}(P)$, and
therefore the results follow. ♠
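
As a concrete check of Corollary 7.63, the Python sketch below uses the matrix of Example 7.57, whose eigenvalues are 18, 9, and 2, and compares its determinant and trace with the product and sum of those eigenvalues. The helper `det3` is a hypothetical name for a direct cofactor expansion of a $3 \times 3$ determinant.

```python
def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[17, -2, -2], [-2, 6, 4], [-2, 4, 6]]   # matrix of Example 7.57
eigenvalues = [18, 9, 2]

determinant = det3(A)
trace = sum(A[k][k] for k in range(3))
print(determinant == 18 * 9 * 2, trace == sum(eigenvalues))
```

This kind of check is a cheap way to catch arithmetic slips when eigenvalues have been computed by hand.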

7.4.2.  The  Singular  Value  Decomposition 


We  begin  this  section  with  an  important  definition. 


Definition  7.64:  Singular  Values 


Let $A$ be an $m \times n$ matrix. The singular values of $A$ are the square roots of the positive eigenvalues
of $A^TA$.


Singular Value Decomposition (SVD) can be thought of as a generalization of orthogonal diagonalization
of a symmetric matrix to an arbitrary $m \times n$ matrix. This decomposition is the focus of this section.

The  following  is  a  useful  result  that  will  help  when  computing  the  SVD  of  matrices. 


Proposition  7.65: 


Let $A$ be an $m \times n$ matrix. Then $A^TA$ and $AA^T$ have the same nonzero eigenvalues.


Proof. Suppose $A$ is an $m \times n$ matrix, and suppose that $\lambda$ is a nonzero eigenvalue of $A^TA$. Then there exists
a nonzero vector $X \in \mathbb{R}^n$ such that

\[
(A^TA)X = \lambda X. \qquad (7.6)
\]

Multiplying both sides of this equation by $A$ yields:

\[
\begin{aligned}
A(A^TA)X &= A\lambda X \\
(AA^T)(AX) &= \lambda(AX).
\end{aligned}
\]

Since $\lambda \neq 0$ and $X \neq 0_n$, $\lambda X \neq 0_n$, and thus by equation (7.6), $(A^TA)X \neq 0_n$; thus $A^T(AX) \neq 0_n$, implying
that $AX \neq 0_m$.

Therefore $AX$ is an eigenvector of $AA^T$ corresponding to eigenvalue $\lambda$. An analogous argument can be
used to show that every nonzero eigenvalue of $AA^T$ is an eigenvalue of $A^TA$, thus completing the proof.

♠
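
The key step of this proof, that $AX$ is an eigenvector of $AA^T$ whenever $X$ is an eigenvector of $A^TA$ for a nonzero eigenvalue, can be seen on a small example. In the Python sketch below, the $2 \times 3$ matrix `A` and the vector `X` are chosen for illustration, and the helper names are hypothetical.

```python
def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def mat_mul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def mat_vec(M, v):
    """Multiply the matrix M by the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A = [[1, -1, 3], [3, 1, 1]]        # an illustrative 2 x 3 matrix
AtA = mat_mul(transpose(A), A)     # 3 x 3
AAt = mat_mul(A, transpose(A))     # 2 x 2

X = [1, 0, 1]                      # eigenvector of AtA for the eigenvalue 16
AX = mat_vec(A, X)                 # the proof says this is an eigenvector of AAt

print(mat_vec(AtA, X) == [16 * x for x in X])
print(mat_vec(AAt, AX) == [16 * x for x in AX])
```

Both checks use the same eigenvalue, 16, which is exactly what the proposition asserts for nonzero eigenvalues.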


Given an $m \times n$ matrix $A$, we will see how to express $A$ as a product

\[
A = U\Sigma V^T
\]

where

• $U$ is an $m \times m$ orthogonal matrix whose columns are eigenvectors of $AA^T$.

• $V$ is an $n \times n$ orthogonal matrix whose columns are eigenvectors of $A^TA$.

• $\Sigma$ is an $m \times n$ matrix whose only nonzero values lie on its main diagonal, and are the singular values
of $A$.
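How can we find such a decomposition? We are aiming to decompose $A$ in the following form:

\[
A = U \begin{bmatrix} \sigma & 0 \\ 0 & 0 \end{bmatrix} V^T
\]

where $\sigma$ is of the form

\[
\sigma = \begin{bmatrix} \sigma_1 & & 0 \\ & \ddots & \\ 0 & & \sigma_k \end{bmatrix}
\]

Thus $A^T = V \begin{bmatrix} \sigma & 0 \\ 0 & 0 \end{bmatrix} U^T$ and it follows that

\[
A^TA = V \begin{bmatrix} \sigma & 0 \\ 0 & 0 \end{bmatrix} U^TU \begin{bmatrix} \sigma & 0 \\ 0 & 0 \end{bmatrix} V^T
= V \begin{bmatrix} \sigma^2 & 0 \\ 0 & 0 \end{bmatrix} V^T
\]

and so $A^TAV = V \begin{bmatrix} \sigma^2 & 0 \\ 0 & 0 \end{bmatrix}$. Similarly, $AA^TU = U \begin{bmatrix} \sigma^2 & 0 \\ 0 & 0 \end{bmatrix}$. Therefore, you would find an orthonormal
basis of eigenvectors for $AA^T$ and make them the columns of a matrix such that the corresponding eigenvalues
are decreasing. This gives $U$. You could then do the same for $A^TA$ to get $V$.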

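We formalize this discussion in the following theorem.


Theorem 7.66: Singular Value Decomposition


Let $A$ be an $m \times n$ matrix. Then there exist orthogonal matrices $U$ and $V$ of the appropriate size such that

\[
A = U \begin{bmatrix} \sigma & 0 \\ 0 & 0 \end{bmatrix} V^T
\]

where $\sigma$ is of the form

\[
\sigma = \begin{bmatrix} \sigma_1 & & 0 \\ & \ddots & \\ 0 & & \sigma_k \end{bmatrix}
\]

and the $\sigma_i$ are the singular values of $A$.


Proof. There exists an orthonormal basis $\{\vec{v}_i\}_{i=1}^n$ such that $A^TA\vec{v}_i = \sigma_i^2\vec{v}_i$ where $\sigma_i^2 > 0$ for $i = 1, \cdots, k$ ($\sigma_i > 0$)
and equals zero if $i > k$. Thus for $i > k$, $A\vec{v}_i = \vec{0}$ because

\[
A\vec{v}_i \cdot A\vec{v}_i = A^TA\vec{v}_i \cdot \vec{v}_i = \vec{0} \cdot \vec{v}_i = 0.
\]

For $i = 1, \cdots, k$, define $\vec{u}_i \in \mathbb{R}^m$ by

\[
\vec{u}_i = \sigma_i^{-1} A\vec{v}_i.
\]

Thus $A\vec{v}_i = \sigma_i\vec{u}_i$. Now

\[
\begin{aligned}
\vec{u}_i \cdot \vec{u}_j &= \sigma_i^{-1} A\vec{v}_i \cdot \sigma_j^{-1} A\vec{v}_j = \sigma_i^{-1}\vec{v}_i \cdot \sigma_j^{-1} A^TA\vec{v}_j \\
&= \sigma_i^{-1}\vec{v}_i \cdot \sigma_j^{-1}\sigma_j^2\vec{v}_j = \frac{\sigma_j}{\sigma_i}\left(\vec{v}_i \cdot \vec{v}_j\right) = \delta_{ij}.
\end{aligned}
\]

Thus $\{\vec{u}_i\}_{i=1}^k$ is an orthonormal set of vectors in $\mathbb{R}^m$. Also,

\[
AA^T\vec{u}_i = AA^T\sigma_i^{-1}A\vec{v}_i = \sigma_i^{-1}AA^TA\vec{v}_i = \sigma_i^{-1}A\sigma_i^2\vec{v}_i = \sigma_i^2\vec{u}_i.
\]

Now extend $\{\vec{u}_i\}_{i=1}^k$ to an orthonormal basis $\{\vec{u}_i\}_{i=1}^m$ for all of $\mathbb{R}^m$ and let

\[
U = \begin{bmatrix} \vec{u}_1 & \cdots & \vec{u}_m \end{bmatrix}
\]

while $V = \begin{bmatrix} \vec{v}_1 & \cdots & \vec{v}_n \end{bmatrix}$. Thus $U$ is the matrix which has the $\vec{u}_i$ as columns and $V$ is defined as the matrix
which has the $\vec{v}_i$ as columns. Then

\[
U^TAV =
\begin{bmatrix} \vec{u}_1^T \\ \vdots \\ \vec{u}_k^T \\ \vdots \\ \vec{u}_m^T \end{bmatrix}
A \begin{bmatrix} \vec{v}_1 & \cdots & \vec{v}_n \end{bmatrix}
=
\begin{bmatrix} \vec{u}_1^T \\ \vdots \\ \vec{u}_k^T \\ \vdots \\ \vec{u}_m^T \end{bmatrix}
\begin{bmatrix} \sigma_1\vec{u}_1 & \cdots & \sigma_k\vec{u}_k & \vec{0} & \cdots & \vec{0} \end{bmatrix}
=
\begin{bmatrix} \sigma & 0 \\ 0 & 0 \end{bmatrix}
\]

where $\sigma$ is given in the statement of the theorem. ♠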

The singular value decomposition has an immediate corollary, which is given in the following
interesting result.


Corollary 7.67: Rank and Singular Values


Let $A$ be an $m \times n$ matrix. Then the rank of $A$ and $A^T$ equals the number of singular values.


Let’s  compute  the  Singular  Value  Decomposition  of  a  simple  matrix. 


Example  7.68:  Singular  Value  Decomposition 

Let $A = \begin{bmatrix} 1 & -1 & 3 \\ 3 & 1 & 1 \end{bmatrix}$. Find the Singular Value Decomposition (SVD) of $A$.

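Solution. To begin, we compute $AA^T$ and $A^TA$.

\[
AA^T = \begin{bmatrix} 1 & -1 & 3 \\ 3 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 3 \\ -1 & 1 \\ 3 & 1 \end{bmatrix}
= \begin{bmatrix} 11 & 5 \\ 5 & 11 \end{bmatrix}
\]

\[
A^TA = \begin{bmatrix} 1 & 3 \\ -1 & 1 \\ 3 & 1 \end{bmatrix}
\begin{bmatrix} 1 & -1 & 3 \\ 3 & 1 & 1 \end{bmatrix}
= \begin{bmatrix} 10 & 2 & 6 \\ 2 & 2 & -2 \\ 6 & -2 & 10 \end{bmatrix}
\]

Since $AA^T$ is $2 \times 2$ while $A^TA$ is $3 \times 3$, and $AA^T$ and $A^TA$ have the same nonzero eigenvalues (by
Proposition 7.65), we compute the characteristic polynomial $c_{AA^T}(x)$ (because it's easier to compute than
$c_{A^TA}(x)$).

\[
\begin{aligned}
c_{AA^T}(x) &= \det(xI - AA^T) = \begin{vmatrix} x - 11 & -5 \\ -5 & x - 11 \end{vmatrix} \\
&= (x - 11)^2 - 25 \\
&= x^2 - 22x + 121 - 25 \\
&= x^2 - 22x + 96 \\
&= (x - 16)(x - 6)
\end{aligned}
\]

Therefore, the eigenvalues of $AA^T$ are $\lambda_1 = 16$ and $\lambda_2 = 6$.

The eigenvalues of $A^TA$ are $\lambda_1 = 16$, $\lambda_2 = 6$, and $\lambda_3 = 0$, and the singular values of $A$ are $\sigma_1 = \sqrt{16} = 4$
and $\sigma_2 = \sqrt{6}$. By convention, we list the eigenvalues (and corresponding singular values) in non increasing
order (i.e., from largest to smallest).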

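To find the matrix $V$:

To construct the matrix $V$ we need to find eigenvectors for $A^TA$. Since the eigenvalues of $A^TA$ are
distinct, the corresponding eigenvectors are orthogonal, and we need only normalize them.

$\lambda_1 = 16$: solve $(16I - A^TA)Y = 0$.

\[
\left[\begin{array}{rrr|r}
6 & -2 & -6 & 0 \\
-2 & 14 & 2 & 0 \\
-6 & 2 & 6 & 0
\end{array}\right]
\rightarrow
\left[\begin{array}{rrr|r}
1 & 0 & -1 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0
\end{array}\right],
\text{ so } Y = \begin{bmatrix} t \\ 0 \\ t \end{bmatrix} = t \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
\]

$\lambda_2 = 6$: solve $(6I - A^TA)Y = 0$.

\[
\left[\begin{array}{rrr|r}
-4 & -2 & -6 & 0 \\
-2 & 4 & 2 & 0 \\
-6 & 2 & -4 & 0
\end{array}\right]
\rightarrow
\left[\begin{array}{rrr|r}
1 & 0 & 1 & 0 \\
0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0
\end{array}\right],
\text{ so } Y = \begin{bmatrix} -s \\ -s \\ s \end{bmatrix} = s \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}
\]

$\lambda_3 = 0$: solve $(-A^TA)Y = 0$.

\[
\left[\begin{array}{rrr|r}
-10 & -2 & -6 & 0 \\
-2 & -2 & 2 & 0 \\
-6 & 2 & -10 & 0
\end{array}\right]
\rightarrow
\left[\begin{array}{rrr|r}
1 & 0 & 1 & 0 \\
0 & 1 & -2 & 0 \\
0 & 0 & 0 & 0
\end{array}\right],
\text{ so } Y = \begin{bmatrix} -r \\ 2r \\ r \end{bmatrix} = r \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}
\]

Let

\[
V_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \quad
V_2 = \frac{1}{\sqrt{3}} \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}, \quad
V_3 = \frac{1}{\sqrt{6}} \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}
\]

Then

\[
V = \frac{1}{\sqrt{6}} \begin{bmatrix}
\sqrt{3} & -\sqrt{2} & -1 \\
0 & -\sqrt{2} & 2 \\
\sqrt{3} & \sqrt{2} & 1
\end{bmatrix}
\]

Also,

\[
\Sigma = \begin{bmatrix} 4 & 0 & 0 \\ 0 & \sqrt{6} & 0 \end{bmatrix}
\]

and we use $A$, $V^T$, and $\Sigma$ to find $U$.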

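Since $V$ is orthogonal and $A = U\Sigma V^T$, it follows that $AV = U\Sigma$. Let $V = \begin{bmatrix} V_1 & V_2 & V_3 \end{bmatrix}$, and let
$U = \begin{bmatrix} U_1 & U_2 \end{bmatrix}$, where $U_1$ and $U_2$ are the two columns of $U$.

Then we have

\[
\begin{aligned}
A \begin{bmatrix} V_1 & V_2 & V_3 \end{bmatrix} &= \begin{bmatrix} U_1 & U_2 \end{bmatrix} \Sigma \\
\begin{bmatrix} AV_1 & AV_2 & AV_3 \end{bmatrix} &= \begin{bmatrix} \sigma_1U_1 + 0U_2 & 0U_1 + \sigma_2U_2 & 0U_1 + 0U_2 \end{bmatrix} \\
&= \begin{bmatrix} \sigma_1U_1 & \sigma_2U_2 & 0 \end{bmatrix}
\end{aligned}
\]

which implies that $AV_1 = \sigma_1U_1 = 4U_1$ and $AV_2 = \sigma_2U_2 = \sqrt{6}\,U_2$.

Thus,

\[
U_1 = \frac{1}{4}AV_1 = \frac{1}{4} \begin{bmatrix} 1 & -1 & 3 \\ 3 & 1 & 1 \end{bmatrix} \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
= \frac{1}{4\sqrt{2}} \begin{bmatrix} 4 \\ 4 \end{bmatrix}
= \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]

and

\[
U_2 = \frac{1}{\sqrt{6}}AV_2 = \frac{1}{\sqrt{6}} \begin{bmatrix} 1 & -1 & 3 \\ 3 & 1 & 1 \end{bmatrix} \frac{1}{\sqrt{3}} \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}
= \frac{1}{3\sqrt{2}} \begin{bmatrix} 3 \\ -3 \end{bmatrix}
= \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}
\]

Therefore,

\[
U = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
\]

and

\[
A = \begin{bmatrix} 1 & -1 & 3 \\ 3 & 1 & 1 \end{bmatrix}
= \left( \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \right)
\begin{bmatrix} 4 & 0 & 0 \\ 0 & \sqrt{6} & 0 \end{bmatrix}
\left( \frac{1}{\sqrt{6}} \begin{bmatrix} \sqrt{3} & 0 & \sqrt{3} \\ -\sqrt{2} & -\sqrt{2} & \sqrt{2} \\ -1 & 2 & 1 \end{bmatrix} \right)
\]

♠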
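
The factorization obtained in Example 7.68 can be checked numerically by multiplying the three factors back together. The Python sketch below does this in floating point with plain lists; the helper names are hypothetical, and agreement with $A$ should be expected only up to rounding error.

```python
import math

def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def mat_mul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)

A = [[1, -1, 3], [3, 1, 1]]
U = [[1 / s2, 1 / s2], [1 / s2, -1 / s2]]
Sigma = [[4, 0, 0], [0, s6, 0]]
V = [[s3 / s6, -s2 / s6, -1 / s6],
     [0, -s2 / s6, 2 / s6],
     [s3 / s6, s2 / s6, 1 / s6]]

product = mat_mul(mat_mul(U, Sigma), transpose(V))   # U Sigma V^T
max_error = max(abs(product[i][j] - A[i][j]) for i in range(2) for j in range(3))
print(max_error)   # a value near machine precision
```

Note that the third column of `V` is multiplied only by zeros in `Sigma`, so it does not affect the product; it is needed only to make `V` a square orthogonal matrix.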


Here is another example.


Example  7.69:  Finding  the  SVD 

Find an SVD for $A = \begin{bmatrix} -1 \\ 2 \\ 2 \end{bmatrix}$.

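Solution. Since $A$ is $3 \times 1$, $A^TA$ is a $1 \times 1$ matrix whose eigenvalues are easier to find than the eigenvalues
of the $3 \times 3$ matrix $AA^T$.

\[
A^TA = \begin{bmatrix} -1 & 2 & 2 \end{bmatrix} \begin{bmatrix} -1 \\ 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 9 \end{bmatrix}.
\]

Thus $A^TA$ has eigenvalue $\lambda_1 = 9$, and the eigenvalues of $AA^T$ are $\lambda_1 = 9$, $\lambda_2 = 0$, and $\lambda_3 = 0$. Furthermore,
$A$ has only one singular value, $\sigma_1 = 3$.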

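To find the matrix $V$: To do so we find an eigenvector for $A^TA$ and normalize it. In this case, finding
a unit eigenvector is trivial: $V_1 = \begin{bmatrix} 1 \end{bmatrix}$, and

\[
V = \begin{bmatrix} 1 \end{bmatrix}.
\]

Also, $\Sigma = \begin{bmatrix} 3 \\ 0 \\ 0 \end{bmatrix}$, and we use $A$, $V^T$, and $\Sigma$ to find $U$.

Now $AV = U\Sigma$, with $V = \begin{bmatrix} V_1 \end{bmatrix}$, and $U = \begin{bmatrix} U_1 & U_2 & U_3 \end{bmatrix}$, where $U_1$, $U_2$, and $U_3$ are the columns of
$U$. Thus

\[
\begin{aligned}
A \begin{bmatrix} V_1 \end{bmatrix} &= \begin{bmatrix} U_1 & U_2 & U_3 \end{bmatrix} \Sigma \\
\begin{bmatrix} AV_1 \end{bmatrix} &= \begin{bmatrix} \sigma_1U_1 + 0U_2 + 0U_3 \end{bmatrix} \\
&= \begin{bmatrix} \sigma_1U_1 \end{bmatrix}
\end{aligned}
\]

This gives us $AV_1 = \sigma_1U_1 = 3U_1$, so

\[
U_1 = \frac{1}{3}AV_1 = \frac{1}{3} \begin{bmatrix} -1 \\ 2 \\ 2 \end{bmatrix} \begin{bmatrix} 1 \end{bmatrix} = \frac{1}{3} \begin{bmatrix} -1 \\ 2 \\ 2 \end{bmatrix}
\]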

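The vectors $U_2$ and $U_3$ are eigenvectors of $AA^T$ corresponding to the eigenvalue $\lambda_2 = \lambda_3 = 0$. Instead
of solving the system $(0I - AA^T)X = 0$ and then using the Gram-Schmidt process on the resulting set of
two basic eigenvectors, the following approach may be used.

Find vectors $U_2$ and $U_3$ by first extending $\{U_1\}$ to a basis of $\mathbb{R}^3$, then using the Gram-Schmidt algorithm
to orthogonalize the basis, and finally normalizing the vectors.

Starting with $\{3U_1\}$ instead of $\{U_1\}$ makes the arithmetic a bit easier. It is easy to verify that

\[
\left\{
\begin{bmatrix} -1 \\ 2 \\ 2 \end{bmatrix},
\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix},
\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
\right\}
\]

is a basis of $\mathbb{R}^3$. Set

\[
E_1 = \begin{bmatrix} -1 \\ 2 \\ 2 \end{bmatrix}, \quad
X_2 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad
X_3 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
\]

and apply the Gram-Schmidt algorithm to $\{E_1, X_2, X_3\}$.
This gives us

\[
E_2 = \begin{bmatrix} 4 \\ 1 \\ 1 \end{bmatrix} \text{ and } E_3 = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}
\]

Therefore,

\[
U_2 = \frac{1}{\sqrt{18}} \begin{bmatrix} 4 \\ 1 \\ 1 \end{bmatrix}, \quad
U_3 = \frac{1}{\sqrt{2}} \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}
\]

and

\[
U = \begin{bmatrix}
-\frac{1}{3} & \frac{4}{\sqrt{18}} & 0 \\
\frac{2}{3} & \frac{1}{\sqrt{18}} & \frac{1}{\sqrt{2}} \\
\frac{2}{3} & \frac{1}{\sqrt{18}} & -\frac{1}{\sqrt{2}}
\end{bmatrix}
\]

Finally,

\[
A = \begin{bmatrix} -1 \\ 2 \\ 2 \end{bmatrix}
= \begin{bmatrix}
-\frac{1}{3} & \frac{4}{\sqrt{18}} & 0 \\
\frac{2}{3} & \frac{1}{\sqrt{18}} & \frac{1}{\sqrt{2}} \\
\frac{2}{3} & \frac{1}{\sqrt{18}} & -\frac{1}{\sqrt{2}}
\end{bmatrix}
\begin{bmatrix} 3 \\ 0 \\ 0 \end{bmatrix}
\begin{bmatrix} 1 \end{bmatrix}
\]

♠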
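
The Gram-Schmidt step used in Example 7.69 can be reproduced exactly with rational arithmetic. The Python sketch below orthogonalizes the basis $\{E_1, X_2, X_3\}$ without normalizing; the function name `gram_schmidt` is introduced here for illustration. The vectors it returns are scalar multiples of the $E_2$ and $E_3$ stated in the example, which were rescaled there to have integer entries.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthogonal (not normalized) list spanning the same space."""
    basis = []
    for v in vectors:
        w = [Fraction(a) for a in v]
        for b in basis:
            c = dot(w, b) / dot(b, b)          # coefficient of the projection onto b
            w = [wi - c * bi for wi, bi in zip(w, b)]
        basis.append(w)
    return basis

E1, E2, E3 = gram_schmidt([[-1, 2, 2], [1, 0, 0], [0, 1, 0]])
print(E2, E3)

# Rescale to integer entries, as in the example
E2_int = [a * Fraction(9, 2) for a in E2]
E3_int = [a * 2 for a in E3]
print(E2_int, E3_int)
```

Since any nonzero multiple of an orthogonal vector is still orthogonal, rescaling before normalizing does not change the resulting unit vectors $U_2$ and $U_3$.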


Consider another example.


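Example 7.70: Find the SVD

Find a singular value decomposition for the matrix
\[ A = \begin{bmatrix} \frac{2}{5}\sqrt{2}\sqrt{5} & \frac{4}{5}\sqrt{2}\sqrt{5} & 0 \\ \frac{2}{5}\sqrt{2}\sqrt{5} & \frac{4}{5}\sqrt{2}\sqrt{5} & 0 \end{bmatrix} \]

First consider $A^TA$:
\[ \begin{bmatrix} \frac{16}{5} & \frac{32}{5} & 0 \\ \frac{32}{5} & \frac{64}{5} & 0 \\ 0 & 0 & 0 \end{bmatrix} \]

What are some eigenvalues and eigenvectors? Some computing shows these are
\[ \left\{ \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} -\frac{2}{5}\sqrt{5} \\ \frac{1}{5}\sqrt{5} \\ 0 \end{bmatrix} \right\} \leftrightarrow 0, \quad \left\{ \begin{bmatrix} \frac{1}{5}\sqrt{5} \\ \frac{2}{5}\sqrt{5} \\ 0 \end{bmatrix} \right\} \leftrightarrow 16 \]

Thus the matrix $V$ is given by
\[ V = \begin{bmatrix} \frac{1}{5}\sqrt{5} & -\frac{2}{5}\sqrt{5} & 0 \\ \frac{2}{5}\sqrt{5} & \frac{1}{5}\sqrt{5} & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

Next consider $AA^T$:
\[ \begin{bmatrix} 8 & 8 \\ 8 & 8 \end{bmatrix} \]

Eigenvectors and eigenvalues are
\[ \left\{ \begin{bmatrix} -\frac{1}{2}\sqrt{2} \\ \frac{1}{2}\sqrt{2} \end{bmatrix} \right\} \leftrightarrow 0, \quad \left\{ \begin{bmatrix} \frac{1}{2}\sqrt{2} \\ \frac{1}{2}\sqrt{2} \end{bmatrix} \right\} \leftrightarrow 16 \]

Thus you can let $U$ be given by
\[ U = \begin{bmatrix} \frac{1}{2}\sqrt{2} & -\frac{1}{2}\sqrt{2} \\ \frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \end{bmatrix} \]

Let's check this. $U^TAV =$
\[ \begin{bmatrix} \frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \\ -\frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \end{bmatrix} \begin{bmatrix} \frac{2}{5}\sqrt{2}\sqrt{5} & \frac{4}{5}\sqrt{2}\sqrt{5} & 0 \\ \frac{2}{5}\sqrt{2}\sqrt{5} & \frac{4}{5}\sqrt{2}\sqrt{5} & 0 \end{bmatrix} \begin{bmatrix} \frac{1}{5}\sqrt{5} & -\frac{2}{5}\sqrt{5} & 0 \\ \frac{2}{5}\sqrt{5} & \frac{1}{5}\sqrt{5} & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]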

This  illustrates  that  if  you  have  a  good  way  to  find  the  eigenvectors  and  eigenvalues  for  a  Hermitian 
matrix  which  has  nonnegative  eigenvalues,  then  you  also  have  a  good  way  to  find  the  singular  value 
decomposition  of  an  arbitrary  matrix. 
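The check $U^TAV = \Sigma$ from Example 7.70 can also be carried out numerically. The following is a minimal sketch in plain Python; the helper names `matmul` and `transpose` are illustrative choices, and the entries of $A$, $U$, and $V$ are typed in by hand from the example rather than computed.

```python
import math

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*X)]

s2, s5 = math.sqrt(2), math.sqrt(5)

# Matrices entered by hand from Example 7.70.
A = [[2 / 5 * s2 * s5, 4 / 5 * s2 * s5, 0.0],
     [2 / 5 * s2 * s5, 4 / 5 * s2 * s5, 0.0]]
U = [[s2 / 2, -s2 / 2],
     [s2 / 2, s2 / 2]]
V = [[s5 / 5, -2 * s5 / 5, 0.0],
     [2 * s5 / 5, s5 / 5, 0.0],
     [0.0, 0.0, 1.0]]

# Sigma should have the singular value 4 in the top-left corner and zeros elsewhere.
Sigma = matmul(matmul(transpose(U), A), V)
for row in Sigma:
    print([round(entry, 10) for entry in row])
```

Up to rounding, the only entry of `Sigma` that is not zero is the singular value $4 = \sqrt{16}$.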

7.4.3.  Positive  Definite  Matrices 


Positive definite matrices are often encountered in applications such as mechanics and statistics.
We begin with a definition.


Definition  7.71:  Positive  Definite  Matrix 


Let  A  be  an  n  x  n  symmetric  matrix.  Then  A  is  positive  definite  if  all  of  its  eigenvalues  are  positive. 


The  relationship  between  a  negative  definite  matrix  and  positive  definite  matrix  is  as  follows. 


Lemma  7.72:  Negative  Definite  Matrix 


An $n \times n$ matrix $A$ is negative definite if and only if $-A$ is positive definite.


Consider  the  following  lemma. 


Lemma  7.73:  Positive  Definite  Matrix  and  Invertibility 


If  A  is  positive  definite,  then  it  is  invertible. 


Proof. If $Av = 0$ for some nonzero $v$, then $0$ is an eigenvalue of $A$, which does not happen for a positive definite matrix. Hence $Av = 0$ forces $v = 0$, and so $A$ is one to one. This is sufficient to conclude that it is invertible. ♠

Notice that if a matrix $A$ is positive definite, then $\det(A) > 0$, since the determinant is the product of the eigenvalues, all of which are positive.

The  following  theorem  provides  another  characterization  of  positive  definite  matrices.  It  gives  a  useful 
test  for  verifying  if  a  matrix  is  positive  definite. 


Theorem  7.74:  Positive  Definite  Matrix 


Let $A$ be a symmetric matrix. Then $A$ is positive definite if and only if $x^TAx$ is positive for all nonzero $x \in \mathbb{R}^n$.


Proof. Since $A$ is symmetric, there exists an orthogonal matrix $U$ so that
\[ U^TAU = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) = D, \]
where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the (not necessarily distinct) eigenvalues of $A$. Let $x \in \mathbb{R}^n$, $x \neq 0$, and define $y = U^Tx$. Then
\[ x^TAx = x^T(UDU^T)x = (x^TU)D(U^Tx) = y^TDy. \]

Writing $y^T = [\, y_1 \; y_2 \; \cdots \; y_n \,]$,
\[ x^TAx = [\, y_1 \; y_2 \; \cdots \; y_n \,] \, \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2. \]

($\Rightarrow$) First we will assume that $A$ is positive definite and prove that $x^TAx$ is positive.

Suppose $A$ is positive definite, and $x \in \mathbb{R}^n$, $x \neq 0$. Since $U^T$ is invertible, $y = U^Tx \neq 0$, and thus $y_j \neq 0$ for some $j$, implying $y_j^2 > 0$ for some $j$. Furthermore, since all eigenvalues of $A$ are positive, $\lambda_i y_i^2 \geq 0$ for all $i$ and $\lambda_j y_j^2 > 0$. Therefore, $x^TAx > 0$.

($\Leftarrow$) Now we will assume $x^TAx$ is positive and show that $A$ is positive definite.

If $x^TAx > 0$ whenever $x \neq 0$, choose $x = Ue_j$, where $e_j$ is the $j^{th}$ column of $I_n$. Since $U$ is invertible, $x \neq 0$, and thus
\[ y = U^Tx = U^T(Ue_j) = e_j. \]

Thus $y_j = 1$ and $y_i = 0$ when $i \neq j$, so
\[ \lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2 = \lambda_j, \]
i.e., $\lambda_j = x^TAx > 0$. Therefore, $A$ is positive definite. ♠

There are some other very interesting consequences which result from a matrix being positive definite. First one can note that the property of being positive definite is transferred to each of the principal submatrices which we will now define.


Definition 7.75: The Submatrix $A_k$


Let $A$ be an $n \times n$ matrix. Denote by $A_k$ the $k \times k$ matrix obtained by deleting the $k+1, \cdots, n$ columns and the $k+1, \cdots, n$ rows from $A$. Thus $A_n = A$ and $A_k$ is the $k \times k$ submatrix of $A$ which occupies the upper left corner of $A$.


Lemma  7.76:  Positive  Definite  and  Submatrices 


Let $A$ be an $n \times n$ positive definite matrix. Then each submatrix $A_k$ is also positive definite.


Proof. This follows right away from the above definition. Let $x \in \mathbb{R}^k$ be nonzero. Then
\[ x^TA_kx = [\, x^T \; 0 \,] \, A \begin{bmatrix} x \\ 0 \end{bmatrix} > 0 \]
by the assumption that $A$ is positive definite. ♠

There  is  yet  another  way  to  recognize  whether  a  matrix  is  positive  definite  which  is  described  in  terms 
of  these  submatrices.  We  state  the  result,  the  proof  of  which  can  be  found  in  more  advanced  texts. 


Theorem 7.77: Positive Matrix and Determinant of $A_k$


Let $A$ be a symmetric matrix. Then $A$ is positive definite if and only if $\det(A_k)$ is greater than $0$ for every submatrix $A_k$, $k = 1, \cdots, n$.


Proof. We prove this theorem by induction on $n$. It is clearly true if $n = 1$. Suppose then that it is true for $n - 1$ where $n \geq 2$. Since $\det(A) = \det(A_n) > 0$, it follows that all the eigenvalues are nonzero. We need to show that they are all positive. Suppose not. Then there is some even number of them which are negative, even because the product of all the eigenvalues is known to be positive, equaling $\det(A)$. Pick two, $\lambda_1$ and $\lambda_2$, and let $Au_i = \lambda_i u_i$ where $u_i \neq 0$ for $i = 1, 2$ and $u_1 \cdot u_2 = 0$. Now if $y = \alpha_1 u_1 + \alpha_2 u_2$ is a nonzero element of $\mathrm{span}\{u_1, u_2\}$, then since these are eigenvectors and $u_1 \cdot u_2 = 0$, a short computation shows
\[ (\alpha_1 u_1 + \alpha_2 u_2)^T A (\alpha_1 u_1 + \alpha_2 u_2) = |\alpha_1|^2 \lambda_1 \|u_1\|^2 + |\alpha_2|^2 \lambda_2 \|u_2\|^2 < 0. \]

Now letting $x \in \mathbb{R}^{n-1}$ be nonzero, we can use the induction hypothesis to write
\[ [\, x^T \; 0 \,] \, A \begin{bmatrix} x \\ 0 \end{bmatrix} = x^TA_{n-1}x > 0. \]

Now the dimension of $\{z \in \mathbb{R}^n : z_n = 0\}$ is $n - 1$ and the dimension of $\mathrm{span}\{u_1, u_2\}$ is $2$, and so there must be some nonzero $x \in \mathbb{R}^n$ which is in both of these subspaces of $\mathbb{R}^n$. However, the first computation would require that $x^TAx < 0$ while the second would require that $x^TAx > 0$. This contradiction shows that all the eigenvalues must be positive. This proves the if part of the theorem. The converse can also be shown to be correct, but it is the direction which was just shown which is of most interest. ♠


Proof. This is immediate from the above theorem when we notice that $A$ is negative definite if and only if $-A$ is positive definite. Therefore, if $\det(-A_k) > 0$ for all $k = 1, \cdots, n$, it follows that $A$ is negative definite. However, $\det(-A_k) = (-1)^k \det(A_k)$. ♠
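The determinant tests above are easy to try on small matrices. The following is a rough sketch in plain Python, not an efficient implementation: the function names are illustrative, the determinant is computed by cofactor expansion (reasonable only for small matrices), and the input is assumed to be symmetric with integer or rational entries.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def leading_minors(A):
    """Return det(A_1), ..., det(A_n) for the upper-left k x k submatrices A_k."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

def is_positive_definite(A):
    """Test of Theorem 7.77 (A is assumed symmetric): every det(A_k) > 0."""
    return all(d > 0 for d in leading_minors(A))

def is_negative_definite(A):
    """Test from the proof above (A is assumed symmetric): (-1)^k det(A_k) > 0."""
    return all((-1) ** k * d > 0
               for k, d in enumerate(leading_minors(A), start=1))

A = [[9, -6, 3], [-6, 5, -3], [3, -3, 6]]
negA = [[-entry for entry in row] for row in A]
print(leading_minors(A), is_positive_definite(A), is_negative_definite(negA))
```

The sample matrix here is an arbitrary symmetric matrix chosen for illustration; its leading determinants are all positive, and those of its negative alternate in sign.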
7.4.3.1. The Cholesky Factorization


Another  important  theorem  is  the  existence  of  a  specific  factorization  of  positive  definite  matrices.  It  is 
called  the  Cholesky  Factorization  and  factors  the  matrix  into  the  product  of  an  upper  triangular  matrix  and 
its  transpose. 


Theorem  7.79:  Cholesky  Factorization 


Let  A  be  a  positive  definite  matrix.  Then  there  exists  an  upper  triangular  matrix  U  whose  main 
diagonal  entries  are  positive,  such  that  A  can  be  written 

\[ A = U^TU \]


This  factorization  is  unique. 


The  process  for  finding  such  a  matrix  U  relies  on  simple  row  operations. 


Procedure  7.80:  Finding  the  Cholesky  Factorization 


Let  A  be  a  positive  definite  matrix.  The  matrix  U  that  creates  the  Cholesky  Factorization  can  be 
found  through  two  steps. 

1. Using only type 3 elementary row operations (multiples of rows added to other rows) put $A$ in upper triangular form. Call this matrix $\hat{U}$. Then $\hat{U}$ has positive entries on the main diagonal.

2. Divide each row of $\hat{U}$ by the square root of the diagonal entry in that row. The result is the matrix $U$.


Of course you can always verify that your factorization is correct by multiplying $U^T$ and $U$ to ensure the result is the original matrix $A$.
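The two steps of Procedure 7.80 translate almost directly into code. The following is a minimal sketch in plain Python; the function name `cholesky_by_row_reduction` is an illustrative choice, exact fractions are used for the row operations, and the input is assumed to be a symmetric positive definite matrix with integer or rational entries.

```python
import math
from fractions import Fraction

def cholesky_by_row_reduction(A):
    """Return upper triangular U with A = U^T U, following Procedure 7.80."""
    n = len(A)
    M = [[Fraction(x) for x in row] for row in A]
    # Step 1: use only type 3 row operations to reach upper triangular form.
    for col in range(n):
        pivot = M[col][col]
        if pivot <= 0:
            raise ValueError("matrix is not positive definite")
        for r in range(col + 1, n):
            factor = M[r][col] / pivot
            M[r] = [M[r][j] - factor * M[col][j] for j in range(n)]
    # Step 2: divide each row by the square root of its diagonal entry.
    return [[float(M[i][j]) / math.sqrt(M[i][i]) for j in range(n)]
            for i in range(n)]

A = [[9, -6, 3], [-6, 5, -3], [3, -3, 6]]
U = cholesky_by_row_reduction(A)

# Verify the factorization by forming U^T U entry by entry.
UtU = [[sum(U[k][i] * U[k][j] for k in range(3)) for j in range(3)]
       for i in range(3)]
print(U)
print(UtU)
```

For this sample matrix the sketch returns the rows $(3, -2, 1)$, $(0, 1, -1)$, $(0, 0, 2)$, and the product $U^TU$ reproduces $A$ up to rounding.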

Consider  the  following  example. 


Example  7.81:  Cholesky  Factorization 

Show that
\[ A = \begin{bmatrix} 9 & -6 & 3 \\ -6 & 5 & -3 \\ 3 & -3 & 6 \end{bmatrix} \]
is positive definite, and find the Cholesky factorization of $A$.

Solution.  First  we  show  that  A  is  positive  definite.  By  Theorem  7.77  it  suffices  to  show  that  the  determinant 
of  each  submatrix  is  positive. 

\[ A_1 = [\, 9 \,] \quad \text{and} \quad A_2 = \begin{bmatrix} 9 & -6 \\ -6 & 5 \end{bmatrix}, \]
so $\det(A_1) = 9$ and $\det(A_2) = 9$. Since $\det(A) = 36$, it follows that $A$ is positive definite.

Now we use Procedure 7.80 to find the Cholesky Factorization. Row reduce (using only type 3 row operations) until an upper triangular matrix is obtained.
\[ \begin{bmatrix} 9 & -6 & 3 \\ -6 & 5 & -3 \\ 3 & -3 & 6 \end{bmatrix} \rightarrow \begin{bmatrix} 9 & -6 & 3 \\ 0 & 1 & -1 \\ 0 & -1 & 5 \end{bmatrix} \rightarrow \begin{bmatrix} 9 & -6 & 3 \\ 0 & 1 & -1 \\ 0 & 0 & 4 \end{bmatrix} \]

Now  divide  the  entries  in  each  row  by  the  square  root  of  the  diagonal  entry  in  that  row,  to  give 


\[ U = \begin{bmatrix} 3 & -2 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 2 \end{bmatrix} \]

You can verify that $U^TU = A$. ♠


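Solution. You can verify that $A$ is in fact positive definite.

To find the Cholesky factorization we first row reduce to an upper triangular matrix.
\[ \begin{bmatrix} 3 & 1 & 1 \\ 1 & 4 & 2 \\ 1 & 2 & 5 \end{bmatrix} \rightarrow \begin{bmatrix} 3 & 1 & 1 \\ 0 & \frac{11}{3} & \frac{5}{3} \\ 0 & \frac{5}{3} & \frac{14}{3} \end{bmatrix} \rightarrow \begin{bmatrix} 3 & 1 & 1 \\ 0 & \frac{11}{3} & \frac{5}{3} \\ 0 & 0 & \frac{43}{11} \end{bmatrix} \]

Now divide the entries in each row by the square root of the diagonal entry in that row and simplify.
\[ U = \begin{bmatrix} \sqrt{3} & \frac{1}{3}\sqrt{3} & \frac{1}{3}\sqrt{3} \\ 0 & \frac{1}{3}\sqrt{3}\sqrt{11} & \frac{5}{33}\sqrt{3}\sqrt{11} \\ 0 & 0 & \frac{1}{11}\sqrt{11}\sqrt{43} \end{bmatrix} \]
♠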
7.4.4. QR Factorization


In  this  section,  a  reliable  factorization  of  matrices  is  studied.  Called  the  QR  factorization  of  a  matrix,  it 
always  exists.  While  much  can  be  said  about  the  QR  factorization,  this  section  will  be  limited  to  real 
matrices.  Therefore  we  assume  the  dot  product  used  below  is  the  usual  dot  product.  We  begin  with  a 
definition. 


Definition  7.83:  QR  Factorization 


Let  A  be  a  real  m  x  n  matrix.  Then  a  QR  factorization  of  A  consists  of  two  matrices,  Q  orthogonal 
and  R  upper  triangular,  such  that  A  =  QR. 


The  following  theorem  claims  that  such  a  factorization  exists. 


Theorem  7.84:  Existence  of  QR  Factorization 


Let $A$ be any real $m \times n$ matrix with linearly independent columns. Then there exists an orthogonal matrix $Q$ and an upper triangular matrix $R$ having non-negative entries on the main diagonal such that
\[ A = QR \]


The  procedure  for  obtaining  the  QR  factorization  for  any  matrix  A  is  as  follows. 
Notice that $Q$ is an orthogonal matrix as the $C_i$ form an orthonormal set. Since $\|B_i\| > 0$ for all $i$ (since the length of a nonzero vector is always positive), it follows that $R$ is an upper triangular matrix with positive entries on the main diagonal.

Consider  the  following  example. 


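Solution. First, observe that $A_1, A_2$, the columns of $A$, are linearly independent. Therefore we can use the Gram-Schmidt Process to create a corresponding orthogonal set $\{B_1, B_2\}$ as follows:
\[ B_1 = A_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \]
\[ B_2 = A_2 - \frac{A_2 \cdot B_1}{\|B_1\|^2} B_1 = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} - \frac{2}{2} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} \]

Normalize each vector to create the set $\{C_1, C_2\}$ as follows:
\[ C_1 = \frac{1}{\|B_1\|} B_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \quad C_2 = \frac{1}{\|B_2\|} B_2 = \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} \]

Now construct the orthogonal matrix $Q$ as
\[ Q = [\, C_1 \; C_2 \,] = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \\ 0 & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{3}} \end{bmatrix} \]

Finally, construct the upper triangular matrix $R$ as
\[ R = \begin{bmatrix} \|B_1\| & A_2 \cdot C_1 \\ 0 & \|B_2\| \end{bmatrix} = \begin{bmatrix} \sqrt{2} & \sqrt{2} \\ 0 & \sqrt{3} \end{bmatrix} \]

It is left to the reader to verify that $A = QR$. ♠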

7.4.4.1. The QR Factorization and Eigenvalues


The  QR  factorization  of  a  matrix  has  a  very  useful  application.  It  turns  out  that  it  can  be  used  repeatedly 
to  estimate  the  eigenvalues  of  a  matrix.  Consider  the  following  procedure. 


Procedure  7.87:  Using  the  QR  Factorization  to  Estimate  Eigenvalues 


Let $A$ be an invertible matrix. Define the matrices $A_1, A_2, \cdots$ as follows:

1. $A_1 = A$ factored as $A_1 = Q_1R_1$

2. $A_2 = R_1Q_1$ factored as $A_2 = Q_2R_2$

3. $A_3 = R_2Q_2$ factored as $A_3 = Q_3R_3$

Continue in this manner, where in general $A_k = Q_kR_k$ and $A_{k+1} = R_kQ_k$.

Then it follows that this sequence of $A_i$ converges to an upper triangular matrix which is similar to $A$. Therefore the eigenvalues of $A$ can be approximated by the entries on the main diagonal of this upper triangular matrix.
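The iteration in Procedure 7.87 can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the matrix is square and invertible, the QR factorization is computed with the Gram-Schmidt process (practical software uses more stable methods), and a fixed number of steps is used instead of a convergence test. The function names are illustrative.

```python
import math

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def qr_gram_schmidt(A):
    """Factor a square invertible matrix as A = QR using Gram-Schmidt."""
    n = len(A)
    columns = [[A[i][j] for i in range(n)] for j in range(n)]
    q_columns = []
    R = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(columns):
        b = list(a)
        for i, c in enumerate(q_columns):
            R[i][j] = sum(x * y for x, y in zip(a, c))   # component of A_j along C_i
            b = [bx - R[i][j] * cx for bx, cx in zip(b, c)]
        R[j][j] = math.sqrt(sum(x * x for x in b))        # length of B_j
        q_columns.append([x / R[j][j] for x in b])
    Q = [[q_columns[j][i] for j in range(n)] for i in range(n)]
    return Q, R

def qr_eigenvalue_estimates(A, steps=60):
    """Repeat A_k = Q_k R_k, A_{k+1} = R_k Q_k and return the diagonal entries."""
    Ak = [[float(x) for x in row] for row in A]
    for _ in range(steps):
        Q, R = qr_gram_schmidt(Ak)
        Ak = matmul(R, Q)
    return [Ak[i][i] for i in range(len(Ak))]

estimates = qr_eigenvalue_estimates([[6, 2], [2, 3]])
print(estimates)
```

The sample matrix is symmetric with eigenvalues $7$ and $2$, and the diagonal entries approach these values as the number of steps grows.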


7.4.4.2.  Power  Methods 


While the QR algorithm can be used to compute eigenvalues, there is a useful and fairly elementary technique for finding the eigenvector and associated eigenvalue nearest to a given complex number which is called the shifted inverse power method. It tends to work extremely well provided you start with something which is fairly close to an eigenvalue.

Power methods are based on the consideration of powers of a given matrix. Let $\{x_1, \cdots, x_n\}$ be a basis of eigenvectors for $\mathbb{C}^n$ such that $Ax_k = \lambda_k x_k$ for each $k$. Now let $u_1$ be some nonzero vector. Since $\{x_1, \cdots, x_n\}$ is a basis, there exist unique scalars $c_i$ such that
\[ u_1 = \sum_{k=1}^{n} c_k x_k \]

Assume you have not been so unlucky as to pick $u_1$ in such a way that $c_n = 0$. Then let $Au_k = u_{k+1}$ so that
\[ u_m = A^m u_1 = \sum_{k=1}^{n-1} c_k \lambda_k^m x_k + \lambda_n^m c_n x_n. \tag{7.7} \]

For large $m$ the last term, $\lambda_n^m c_n x_n$, determines quite well the direction of the vector on the right. This is because $|\lambda_n|$ is larger than $|\lambda_k|$ for $k < n$ and so for a large $m$, the sum $\sum_{k=1}^{n-1} c_k \lambda_k^m x_k$ on the right is fairly insignificant. Therefore, for large $m$, $u_m$ is essentially a multiple of the eigenvector $x_n$, the one which goes with $\lambda_n$. The only problem is that there is no control of the size of the vectors $u_m$. You can fix this by scaling. Let $S_2$ denote the entry of $Au_1$ which is largest in absolute value. We call this a scaling factor. Then $u_2$ will not be just $Au_1$ but $Au_1/S_2$. Next let $S_3$ denote the entry of $Au_2$ which has largest absolute value and define $u_3 = Au_2/S_3$. Continue this way. The scaling just described does not destroy the relative insignificance of the term involving a sum in 7.7. Indeed it amounts to nothing more than changing the units of length. Also note that from this scaling procedure, the absolute value of the largest element of $u_k$ is always equal to $1$. Therefore, for large $m$,
\[ u_m = \frac{\lambda_n^m c_n x_n}{S_2 S_3 \cdots S_m} + (\text{relatively insignificant term}). \]

Therefore, the entry of $Au_m$ which has the largest absolute value is essentially equal to the entry having largest absolute value of
\[ A\left( \frac{\lambda_n^m c_n x_n}{S_2 S_3 \cdots S_m} \right) = \frac{\lambda_n^{m+1} c_n x_n}{S_2 S_3 \cdots S_m} \approx \lambda_n u_m \]
and so for large $m$, it must be the case that $\lambda_n \approx S_{m+1}$. This suggests the following procedure.


Procedure  7.88:  Finding  the  Largest  Eigenvalue  with  its  Eigenvector 


1. Start with a vector $u_1$ which you hope has a component in the direction of $x_n$. The vector $(1, \cdots, 1)^T$ is usually a pretty good choice.

2. If $u_k$ is known,
\[ u_{k+1} = \frac{Au_k}{S_{k+1}} \]
where $S_{k+1}$ is the entry of $Au_k$ which has largest absolute value.

3. When the scaling factors $S_k$ are not changing much, $S_{k+1}$ will be close to the eigenvalue and $u_{k+1}$ will be close to an eigenvector.

4. Check your answer to see if it worked well.
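The steps of Procedure 7.88 can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions: the function name `power_method` is made up for this sketch, a fixed number of iterations replaces the informal "not changing much" test of step 3, and the sample matrix is chosen so that it has a single eigenvalue of largest absolute value.

```python
def power_method(A, u, iterations=50):
    """Scaled power iteration: returns the last scaling factor and the last vector."""
    scale = None
    for _ in range(iterations):
        w = [sum(a * x for a, x in zip(row, u)) for row in A]   # w = A u_k
        scale = max(w, key=abs)           # S_{k+1}: entry of largest absolute value
        u = [x / scale for x in w]        # u_{k+1} = A u_k / S_{k+1}
    return scale, u

A = [[6, 2], [2, 3]]
eigenvalue, eigenvector = power_method(A, [1.0, 1.0])

# Step 4 of the procedure: check the answer by comparing A u with (eigenvalue) u.
residual = [sum(a * x for a, x in zip(row, eigenvector)) - eigenvalue * v
            for row, v in zip(A, eigenvector)]
print(eigenvalue, eigenvector, residual)
```

For this sample matrix the eigenvalues are $7$ and $2$, so the scaling factors settle near $7$ and the vector settles near $(1, 0.5)^T$, a multiple of the eigenvector $(2, 1)^T$.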


The shifted inverse power method involves finding the eigenvalue closest to a given complex number along with the associated eigenvector. If $\mu$ is a complex number and you want to find $\lambda$ which is closest to $\mu$, you could consider the eigenvalues and eigenvectors of $(A - \mu I)^{-1}$. Then $Ax = \lambda x$ if and only if
\[ (A - \mu I)x = (\lambda - \mu)x \]
if and only if
\[ \frac{1}{\lambda - \mu} x = (A - \mu I)^{-1} x \]

Thus, if $\lambda$ is the closest eigenvalue of $A$ to $\mu$ then out of all eigenvalues of $(A - \mu I)^{-1}$, the eigenvalue $\frac{1}{\lambda - \mu}$ would be the largest in absolute value. Thus all you have to do is apply the power method to $(A - \mu I)^{-1}$ and the eigenvector you get will be the eigenvector which corresponds to $\lambda$ where $\lambda$ is the closest to $\mu$ of all eigenvalues of $A$. You could use the eigenvector to determine this directly.
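The shifted inverse power method can be sketched in plain Python using complex arithmetic. The sketch below makes a few choices that are not part of the description above: instead of forming $(A - \mu I)^{-1}$ explicitly, each step solves the system $(A - \mu I)w = u$ by Gaussian elimination, which gives the same vector $w = (A - \mu I)^{-1}u$; the eigenvalue is recovered at the end from the ratio of corresponding entries of $Au$ and $u$; and the function names and the sample matrix and shift are illustrative.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting (complex entries)."""
    n = len(M)
    aug = [[complex(x) for x in row] + [complex(bi)] for row, bi in zip(M, b)]
    for col in range(n):
        p = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[p] = aug[p], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    x = [0j] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def shifted_inverse_power(A, mu, u, iterations=30):
    """Approximate the eigenvalue of A closest to mu and an eigenvector for it."""
    n = len(A)
    shifted = [[A[i][j] - (mu if i == j else 0) for j in range(n)]
               for i in range(n)]
    for _ in range(iterations):
        w = solve(shifted, u)             # w = (A - mu I)^{-1} u
        scale = max(w, key=abs)           # scaling factor, as in the power method
        u = [x / scale for x in w]
    # Recover the eigenvalue of A from the ratio of corresponding entries of Au and u.
    Au = [sum(a * x for a, x in zip(row, u)) for row in A]
    k = max(range(n), key=lambda i: abs(Au[i]))
    return Au[k] / u[k], u

A = [[3, 2, 1], [-2, 0, -1], [-2, -2, 0]]
lam, vec = shifted_inverse_power(A, 0.9 + 0.9j, [1, 1, 1])
print(lam, vec)
```

The sample matrix has eigenvalues $1$, $1+i$, and $1-i$, so with the shift $\mu = 0.9 + 0.9i$ the iteration settles on the eigenvalue $1 + i$.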


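Solution. Form
\[ \left( \begin{bmatrix} 3 & 2 & 1 \\ -2 & 0 & -1 \\ -2 & -2 & 0 \end{bmatrix} - (.9 + .9i) \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right)^{-1} = \begin{bmatrix} -0.61919 - 10.545i & -5.5249 - 4.9724i & -0.37057 - 5.8213i \\ 5.5249 + 4.9724i & 5.2762 + 0.24862i & 2.7624 + 2.4862i \\ 0.74114 + 11.643i & 5.5249 + 4.9724i & 0.49252 + 6.9189i \end{bmatrix} \]

Then pick an initial guess and multiply by this matrix raised to a large power.
\[ \begin{bmatrix} -0.61919 - 10.545i & -5.5249 - 4.9724i & -0.37057 - 5.8213i \\ 5.5249 + 4.9724i & 5.2762 + 0.24862i & 2.7624 + 2.4862i \\ 0.74114 + 11.643i & 5.5249 + 4.9724i & 0.49252 + 6.9189i \end{bmatrix}^{15} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \]

This equals
\[ \begin{bmatrix} 1.5629 \times 10^{13} - 3.8993 \times 10^{12} i \\ -5.8645 \times 10^{12} + 9.7642 \times 10^{12} i \\ -1.5629 \times 10^{13} + 3.8999 \times 10^{12} i \end{bmatrix} \]

Now divide by an entry to make the vector have reasonable size. This yields
\[ \begin{bmatrix} -0.99999 - 3.6140 \times 10^{-5} i \\ 0.49999 - 0.49999i \\ 1.0 \end{bmatrix} \]
which is close to
\[ \begin{bmatrix} -1 \\ 0.5 - 0.5i \\ 1.0 \end{bmatrix} \]

Then
\[ \begin{bmatrix} 3 & 2 & 1 \\ -2 & 0 & -1 \\ -2 & -2 & 0 \end{bmatrix} \begin{bmatrix} -1 \\ 0.5 - 0.5i \\ 1.0 \end{bmatrix} = \begin{bmatrix} -1.0 - 1.0i \\ 1.0 \\ 1.0 + 1.0i \end{bmatrix} \]

Now to determine the eigenvalue, you could just take the ratio of corresponding entries. Pick the two corresponding entries which have the largest absolute values. In this case, you would get the eigenvalue is $1 + i$ which happens to be the exact eigenvalue. Thus an eigenvector and eigenvalue are
\[ \begin{bmatrix} -1 \\ 0.5 - 0.5i \\ 1.0 \end{bmatrix}, \quad 1 + i \]
♠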

Usually  it  won’t  work  out  so  well  but  you  can  still  find  what  is  desired.  Thus,  once  you  have  obtained 
approximate  eigenvalues  using  the  QR  algorithm,  you  can  find  the  eigenvalue  more  exactly  along  with  an 
eigenvector  associated  with  it  by  using  the  shifted  inverse  power  method. 

7.4.5.  Quadratic  Forms 


One of the applications of orthogonal diagonalization is that of quadratic forms and graphs of level curves of a quadratic form. This section has to do with rotation of axes so that with respect to the new axes, the graph of the level curve of a quadratic form is oriented parallel to the coordinate axes. This makes it much easier to understand. For example, we all know that $x_1^2 + x_2^2 = 1$ represents the equation in two variables whose graph in $\mathbb{R}^2$ is a circle of radius 1. But how do we know what the graph of the equation $5x_1^2 + 4x_1x_2 + 3x_2^2 = 1$ represents?

We  first  formally  define  what  is  meant  by  a  quadratic  form.  In  this  section  we  will  work  with  only  real 
quadratic  forms,  which  means  that  the  coefficients  will  all  be  real  numbers. 


Definition  7.90:  Quadratic  Form 

A quadratic form is a polynomial of degree two in $n$ variables $x_1, x_2, \cdots, x_n$, written as a linear combination of $x_i^2$ terms and $x_ix_j$ terms.

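Consider the quadratic form $q = a_{11}x_1^2 + a_{22}x_2^2 + \cdots + a_{nn}x_n^2 + a_{12}x_1x_2 + \cdots$. We can write
\[ x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \]
as the vector whose entries are the variables contained in the quadratic form.

Similarly, let
\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \]
be the matrix whose entries are the coefficients of $x_i^2$ and $x_ix_j$ from $q$. Note that the matrix $A$ is not unique, and we will consider this further in the example below. Using this matrix $A$, the quadratic form can be written as $q = x^TAx$.
\[ q = x^TAx = [\, x_1 \; x_2 \; \cdots \; x_n \,] \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \]
\[ = [\, x_1 \; x_2 \; \cdots \; x_n \,] \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n \end{bmatrix} \]
\[ = a_{11}x_1^2 + a_{22}x_2^2 + \cdots + a_{nn}x_n^2 + a_{12}x_1x_2 + \cdots \]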


Let’s  explore  how  to  find  this  matrix  A.  Consider  the  following  example. 


Solution. First, let $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ and $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$.

Then, writing $q = x^TAx$ gives
\[ q = [\, x_1 \; x_2 \,] \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = a_{11}x_1^2 + a_{21}x_1x_2 + a_{12}x_1x_2 + a_{22}x_2^2 \]

Notice that we have an $x_1x_2$ term as well as an $x_2x_1$ term. Since multiplication is commutative, these terms can be combined. This means that $q$ can be written
\[ q = a_{11}x_1^2 + (a_{21} + a_{12})x_1x_2 + a_{22}x_2^2 \]

Equating this to $q$ as given in the example, we have
\[ a_{11}x_1^2 + (a_{21} + a_{12})x_1x_2 + a_{22}x_2^2 = 6x_1^2 + 4x_1x_2 + 3x_2^2 \]

Therefore,
\[ a_{11} = 6, \quad a_{22} = 3, \quad a_{21} + a_{12} = 4 \]

This demonstrates that the matrix $A$ is not unique, as there are several correct solutions to $a_{21} + a_{12} = 4$. However, we will always choose the coefficients such that $a_{21} = a_{12} = \frac{1}{2}(a_{21} + a_{12})$. This results in $a_{21} = a_{12} = 2$. This choice is key, as it will ensure that $A$ turns out to be a symmetric matrix.

Hence,
\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} 6 & 2 \\ 2 & 3 \end{bmatrix} \]

You can verify that $q = x^TAx$ holds for this choice of $A$. ♠


The above procedure for choosing $A$ to be symmetric applies for any quadratic form $q$. We will always choose coefficients such that $a_{ij} = a_{ji}$.

We now turn our attention to the focus of this section. Our goal is to start with a quadratic form $q$ as given above and find a way to rewrite it to eliminate the $x_ix_j$ terms. This is done through a change of variables. In other words, we wish to find $y_i$ such that
\[ q = d_{11}y_1^2 + d_{22}y_2^2 + \cdots + d_{nn}y_n^2 \]

Letting $y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$ and $D = [d_{ij}]$, we can write $q = y^TDy$ where $D$ is the matrix of coefficients from $q$.

There is something special about this matrix $D$ that is crucial. Since no $y_iy_j$ terms exist in $q$, it follows that $d_{ij} = 0$ for all $i \neq j$. Therefore, $D$ is a diagonal matrix. Through this change of variables, we find the principal axes $y_1, y_2, \cdots, y_n$ of the quadratic form.

This discussion sets the stage for the following essential theorem.

Theorem 7.92: Diagonalizing a Quadratic Form

Let $q$ be a quadratic form in the variables $x_1, \cdots, x_n$. It follows that $q$ can be written in the form $q = x^TAx$ where
\[ x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \]
and $A = [a_{ij}]$ is the symmetric matrix of coefficients of $q$.

New variables $y_1, y_2, \cdots, y_n$ can be found such that $q = y^TDy$ where
\[ y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \]
and $D = [d_{ij}]$ is a diagonal matrix. The matrix $D$ contains the eigenvalues of $A$ and is found by orthogonally diagonalizing $A$.

While not a formal proof, the following discussion should convince you that the above theorem holds. Let $q$ be a quadratic form in the variables $x_1, \cdots, x_n$. Then, $q$ can be written in the form $q = x^TAx$ for a symmetric matrix $A$. By Theorem 7.54 we can orthogonally diagonalize the matrix $A$ such that $U^TAU = D$ for an orthogonal matrix $U$ and diagonal matrix $D$.

Then, the vector $y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}$ is found by $y = U^Tx$. To see that this works, rewrite $y = U^Tx$ as $x = Uy$.

Letting $q = x^TAx$, proceed as follows:
\[ q = x^TAx = (Uy)^TA(Uy) = y^T(U^TAU)y = y^TDy \]

The  following  procedure  details  the  steps  for  the  change  of  variables  given  in  the  above  theorem. 


Consider  the  following  example. 
Example 7.94: Choosing New Axes to Simplify a Quadratic Form

Consider the following level curve
\[ 6x_1^2 + 4x_1x_2 + 3x_2^2 = 7 \]
shown in the following graph.

[Figure: the level curve, an ellipse, plotted against the $x_1$ and $x_2$ axes.]

Use a change of variables to choose new axes such that the ellipse is oriented parallel to the new coordinate axes. In other words, use a change of variables to rewrite $q$ to eliminate the $x_1x_2$ term.

Solution. Notice that the level curve is given by $q = 7$ for $q = 6x_1^2 + 4x_1x_2 + 3x_2^2$. This is the same quadratic form that we examined earlier in Example 7.91. Therefore we know that we can write $q = x^TAx$ for the matrix
\[ A = \begin{bmatrix} 6 & 2 \\ 2 & 3 \end{bmatrix} \]

Now we want to orthogonally diagonalize $A$ to write $U^TAU = D$ for an orthogonal matrix $U$ and diagonal matrix $D$. The details are left to the reader, and you can verify that the resulting matrices are
\[ U = \begin{bmatrix} \frac{2}{\sqrt{5}} & -\frac{1}{\sqrt{5}} \\ \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \end{bmatrix}, \quad D = \begin{bmatrix} 7 & 0 \\ 0 & 2 \end{bmatrix} \]

Next we write $y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$. It follows that $x = Uy$.

We can now express the quadratic form $q$ in terms of $y$, using the entries from $D$ as coefficients as follows:
\[ q = d_{11}y_1^2 + d_{22}y_2^2 = 7y_1^2 + 2y_2^2 \]

Hence the level curve can be written $7y_1^2 + 2y_2^2 = 7$. The graph of this equation is given by:

[Figure: the ellipse plotted against the $y_1$ and $y_2$ axes.]

The change of variables results in new axes such that with respect to the new axes, the ellipse is oriented parallel to the coordinate axes. These are called the principal axes of the quadratic form. ♠

The  following  is  another  example  of  diagonalizing  a  quadratic  form. 


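Example 7.95: Choosing New Axes to Simplify a Quadratic Form

Consider the level curve
\[ 5x_1^2 - 6x_1x_2 + 5x_2^2 = 8 \]
shown in the following graph.

[Figure: the level curve, an ellipse, plotted against the $x_1$ and $x_2$ axes.]

Use a change of variables to choose new axes such that the ellipse is oriented parallel to the new coordinate axes. In other words, use a change of variables to rewrite $q$ to eliminate the $x_1x_2$ term.

Solution. First, express the level curve as $x^TAx$ where $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ and $A$ is symmetric. Let $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$. Then $q = x^TAx$ is given by
\[ q = [\, x_1 \; x_2 \,] \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = a_{11}x_1^2 + (a_{12} + a_{21})x_1x_2 + a_{22}x_2^2 \]

Equating this to the given description for $q$, we have
\[ 5x_1^2 - 6x_1x_2 + 5x_2^2 = a_{11}x_1^2 + (a_{12} + a_{21})x_1x_2 + a_{22}x_2^2 \]

This implies that $a_{11} = 5$, $a_{22} = 5$ and in order for $A$ to be symmetric, $a_{12} = a_{21} = \frac{1}{2}(a_{12} + a_{21}) = -3$. The result is $A = \begin{bmatrix} 5 & -3 \\ -3 & 5 \end{bmatrix}$. We can write $q = x^TAx$ as
\[ [\, x_1 \; x_2 \,] \begin{bmatrix} 5 & -3 \\ -3 & 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 8 \]

Next, orthogonally diagonalize the matrix $A$ to write $U^TAU = D$. The details are left to the reader and the necessary matrices are given by
\[ U = \begin{bmatrix} \frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \\ \frac{1}{2}\sqrt{2} & -\frac{1}{2}\sqrt{2} \end{bmatrix}, \quad D = \begin{bmatrix} 2 & 0 \\ 0 & 8 \end{bmatrix} \]

Write $y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$, such that $x = Uy$. Then it follows that $q$ is given by
\[ q = d_{11}y_1^2 + d_{22}y_2^2 = 2y_1^2 + 8y_2^2 \]

Therefore the level curve can be written as $2y_1^2 + 8y_2^2 = 8$.

This is an ellipse which is parallel to the coordinate axes. Its graph is of the form

[Figure: the ellipse plotted against the $y_1$ and $y_2$ axes.]

Thus this change of variables chooses new axes such that with respect to these new axes, the ellipse is oriented parallel to the coordinate axes. ♠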


Exercises 


Exercise 7.4.1 Find the eigenvalues and an orthonormal basis of eigenvectors for $A$.
\[ A = \begin{bmatrix} 11 & -1 & -4 \\ -1 & 11 & -4 \\ -4 & -4 & 14 \end{bmatrix} \]

Hint: Two eigenvalues are 12 and 18.

Exercise  7.4.2  Find  the  eigenvalues  and  an  orthonormal  basis  of  eigenvectors  for  A. 

\[ A = \begin{bmatrix} 4 & 1 & -2 \\ 1 & 4 & -2 \\ -2 & -2 & 7 \end{bmatrix} \]

Hint:  One  eigenvalue  is  3. 

Exercise 7.4.3 Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^TAU = D$.
\[ A = \begin{bmatrix} -1 & 1 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{bmatrix} \]

Hint:  One  eigenvalue  is  -2. 

Exercise 7.4.4 Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^TAU = D$.
\[ A = \begin{bmatrix} 17 & -7 & -4 \\ -7 & 17 & -4 \\ -4 & -4 & 14 \end{bmatrix} \]

Hint: Two eigenvalues are 18 and 24.

Exercise 7.4.5 Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^TAU = D$.
\[ A = \begin{bmatrix} 13 & 1 & 4 \\ 1 & 13 & 4 \\ 4 & 4 & 10 \end{bmatrix} \]

Hint: Two eigenvalues are 12 and 18.

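Exercise 7.4.6 Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^TAU = D$.
\[ A = \begin{bmatrix} -\frac{5}{3} & \frac{1}{15}\sqrt{6}\sqrt{5} & \frac{8}{15}\sqrt{5} \\ \frac{1}{15}\sqrt{6}\sqrt{5} & -\frac{14}{5} & -\frac{1}{15}\sqrt{6} \\ \frac{8}{15}\sqrt{5} & -\frac{1}{15}\sqrt{6} & \frac{7}{15} \end{bmatrix} \]

Hint: The eigenvalues are $-3, -2, 1$.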
Exercise 7.4.7 Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^TAU = D$.


A  = 


3  0 
0  | 

0  \ 


0 

1 

2 

3 

2 


Exercise 7.4.8 Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^TAU = D$.
\[ A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 5 & 1 \\ 0 & 1 & 5 \end{bmatrix} \]


Exercise 7.4.9 Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^TAU = D$.


f  \y/3y/l  \sfl 
—  y\/3  f 


Hint:  The  eigenvalues  are  0, 2, 2  where  2  is  listed  twice  because  it  is  a  root  of  multiplicity  2. 


Exercise 7.4.10 Find the eigenvalues and an orthonormal basis of eigenvectors for $A$. Diagonalize $A$ by finding an orthogonal matrix $U$ and a diagonal matrix $D$ such that $U^TAU = D$.


A  = 


1  \s/3s/6  ' 

\s/3s/2  I  ^s/2V6 
IV3V6  ±V2V6  \ 


Hint:  The  eigenvalues  are  2, 1,0. 


Exercise  7.4.11  Find  the  eigenvalues  and  an  orthonormal  basis  of  eigenvectors  for  the  matrix 


i  -^V3V6 

IV3V 2  I 

—J2V2V6  — | 

Hint: The eigenvalues are $1, 2, -2$.

Exercise  7.4.12  Find  the  eigenvalues  and  an  orthonormal  basis  of  eigenvectors  for  the  matrix 

4  -JvV5 

-iVSVs  l  -\Ve 

TqV^  — 5^/6 

Hint: The eigenvalues are $-1, 2, -1$ where $-1$ is listed twice because it has multiplicity 2 as a zero of
the characteristic equation.

Exercise 7.4.13 Explain why a matrix A is symmetric if and only if there exists an orthogonal matrix U
such that $A = U^T D U$ for D a diagonal matrix.

Exercise 7.4.14 Show that if A is a real symmetric matrix and $\lambda$ and $\mu$ are two different eigenvalues, then
if X is an eigenvector for $\lambda$ and Y is an eigenvector for $\mu$, then $X \cdot Y = 0$. Also all eigenvalues are real.
Supply reasons for each step in the following argument. First

$$\lambda X^T \overline{X} = (AX)^T \overline{X} = X^T A \overline{X} = X^T \overline{AX} = X^T \overline{\lambda}\, \overline{X} = \overline{\lambda} X^T \overline{X}$$

and so $\lambda = \overline{\lambda}$. This shows that all eigenvalues are real. It follows all the eigenvectors are real. Why? Now
let $X, Y, \mu$ and $\lambda$ be given as above.

$$\lambda (X \cdot Y) = \lambda X \cdot Y = AX \cdot Y = X \cdot AY = X \cdot \mu Y = \mu (X \cdot Y)$$

and so

$$(\lambda - \mu) X \cdot Y = 0$$

Why does it follow that $X \cdot Y = 0$?

Exercise 7.4.15 Find the Cholesky factorization for the matrix

$$\begin{bmatrix} 1 & 2 & 0 \\ 2 & 6 & 4 \\ 0 & 4 & 10 \end{bmatrix}$$

Exercise 7.4.16 Find the Cholesky factorization of the matrix

$$\begin{bmatrix} 4 & 8 & 0 \\ 8 & 17 & 2 \\ 0 & 2 & 13 \end{bmatrix}$$

Exercise 7.4.17 Find the Cholesky factorization of the matrix

$$\begin{bmatrix} 4 & 8 & 0 \\ 8 & 20 & 8 \\ 0 & 8 & 20 \end{bmatrix}$$

Exercise 7.4.18 Find the Cholesky factorization of the matrix

$$\begin{bmatrix} 1 & 2 & 1 \\ 2 & 8 & 10 \\ 1 & 10 & 18 \end{bmatrix}$$

Exercise 7.4.19 Find the Cholesky factorization of the matrix

$$\begin{bmatrix} 1 & 2 & 1 \\ 2 & 8 & 10 \\ 1 & 10 & 26 \end{bmatrix}$$


Exercise 7.4.20 Suppose you have a lower triangular matrix L and it is invertible. Show that $LL^T$ must
be positive definite.


Exercise  7.4.21  Using  the  Gram  Schmidt  process  or  the  QR  factorization,  find  an  orthonormal  basis  for 
the  following  span: 


span 


Exercise  7.4.22  Using  the  Gram  Schmidt  process  or  the  QR  factorization,  find  an  orthonormal  basis  for 
the  following  span: 


span  = 


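Exercise 7.4.23 Here are some matrices. Find a QR factorization for each.

(a) $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 3 & 4 \\ 0 & 0 & 1 \end{bmatrix}$

(b) $\begin{bmatrix} 2 & 1 \\ 2 & 1 \end{bmatrix}$

(c) $\begin{bmatrix} 1 & 2 \\ -1 & 2 \end{bmatrix}$

(d) $\begin{bmatrix} 1 & 1 \\ 2 & 3 \end{bmatrix}$

(e) $\begin{bmatrix} \sqrt{11} & 1 & 3\sqrt{6} \\ \sqrt{11} & 7 & -\sqrt{6} \\ 2\sqrt{11} & -4 & -\sqrt{6} \end{bmatrix}$ Hint: Notice that the columns are orthogonal.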

Exercise 7.4.24 Using a computer algebra system, find a QR factorization for the following matrices.

(a) $\begin{bmatrix} 1 & 1 & 2 \\ 3 & -2 & 3 \\ 2 & 1 & 1 \end{bmatrix}$

(b) $\begin{bmatrix} 1 & 2 & 1 & 3 \\ 4 & 5 & -4 & 3 \\ 2 & 1 & 2 & 1 \end{bmatrix}$

(c) $\begin{bmatrix} 1 & 2 \\ 3 & 2 \\ 1 & -4 \end{bmatrix}$ Find the thin QR factorization of this one.
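
For readers without a computer algebra system, the following is a minimal sketch of a thin QR factorization computed by the classical Gram Schmidt process. The function name `thin_qr` and the sample columns are illustrative assumptions, not taken from the text, and the sketch assumes the columns are linearly independent.

```python
import math

def thin_qr(columns):
    """Classical Gram Schmidt on linearly independent column vectors.

    Returns (q_cols, R): orthonormal columns q_i and an upper triangular R
    such that column j of the input equals sum_i R[i][j] * q_i.
    """
    n = len(columns)
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(columns):
        v = list(a)
        for i, q in enumerate(q_cols):
            # Component of the original column a along q_i
            R[i][j] = sum(qk * ak for qk, ak in zip(q, a))
            v = [vk - R[i][j] * qk for vk, qk in zip(v, q)]
        R[j][j] = math.sqrt(sum(vk * vk for vk in v))
        q_cols.append([vk / R[j][j] for vk in v])
    return q_cols, R

# Hypothetical 3x2 example, given by its two columns
cols = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
q_cols, R = thin_qr(cols)
print(q_cols)
print(R)
```

Here Q is the 3 x 2 matrix whose columns are `q_cols` and R is 2 x 2, which is what makes the factorization "thin".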


Exercise 7.4.25 A quadratic form in three variables is an expression of the form $a_1 x^2 + a_2 y^2 + a_3 z^2 + a_4 xy + a_5 xz + a_6 yz$. Show that every such quadratic form may be written as

$$\begin{bmatrix} x & y & z \end{bmatrix} A \begin{bmatrix} x \\ y \\ z \end{bmatrix}$$

where A is a symmetric matrix.


Exercise 7.4.26 Given a quadratic form in three variables, x, y, and z, show there exists an orthogonal
matrix U and variables $x', y', z'$ such that

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = U \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix}$$

with the property that in terms of the new variables, the quadratic form is

$$\lambda_1 (x')^2 + \lambda_2 (y')^2 + \lambda_3 (z')^2$$

where the numbers $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the eigenvalues of the matrix A in Problem 7.4.25.

Exercise 7.4.27 Consider the quadratic form q given by $q = 3x_1^2 - 12x_1x_2 - 2x_2^2$.

(a) Write q in the form $\vec{x}^T A \vec{x}$ for an appropriate symmetric matrix A.

(b) Use a change of variables to rewrite q to eliminate the $x_1x_2$ term.


Exercise 7.4.28 Consider the quadratic form q given by $q = -2x_1^2 + 2x_1x_2 - 2x_2^2$.

(a) Write q in the form $\vec{x}^T A \vec{x}$ for an appropriate symmetric matrix A.

(b) Use a change of variables to rewrite q to eliminate the $x_1x_2$ term.

Exercise 7.4.29 Consider the quadratic form q given by $q = 7x_1^2 + 6x_1x_2 - x_2^2$.

(a) Write q in the form $\vec{x}^T A \vec{x}$ for an appropriate symmetric matrix A.

(b) Use a change of variables to rewrite q to eliminate the $x_1x_2$ term.


8.  Some  Curvilinear  Coordinate  Systems 


8.1  Polar  Coordinates  and  Polar  Graphs 


A.  Understand  polar  coordinates. 

B.  Convert  points  between  Cartesian  and  polar  coordinates. 


You  have  likely  encountered  the  Cartesian  coordinate  system  in  many  aspects  of  mathematics.  There 
is  an  alternative  way  to  represent  points  in  space,  called  polar  coordinates.  The  idea  is  suggested  in  the 
following  picture. 


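[Figure: a point in the plane labeled with its Cartesian coordinates $(x, y)$ and its polar coordinates $(r, \theta)$.]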


Consider the point above, which would be specified as $(x, y)$ in Cartesian coordinates. We can also
specify this point using polar coordinates, which we write as $(r, \theta)$. The number r is the distance from
the origin $(0, 0)$ to the point, while $\theta$ is the angle shown between the positive x axis and the line from the
origin to the point. In this way, the point can be specified in polar coordinates as $(r, \theta)$.

Now suppose we are given an ordered pair $(r, \theta)$ where r and $\theta$ are real numbers. We want to determine
the point specified by this ordered pair. We can use $\theta$ to identify a ray from the origin as follows. Let the
ray pass from $(0, 0)$ through the point $(\cos\theta, \sin\theta)$ as shown.


[Figure: the ray from the origin through the point $(\cos(\theta), \sin(\theta))$.]

The ray is identified on the graph as the line from the origin, through the point $(\cos(\theta), \sin(\theta))$. Now
if $r > 0$, go a distance equal to r in the direction of the displayed arrow starting at $(0, 0)$. If $r < 0$, move in
the opposite direction a distance of $|r|$. This is the point determined by $(r, \theta)$.

It is common to assume that $\theta$ is in the interval $[0, 2\pi)$ and $r > 0$. In this case, there is a very simple
relationship between the Cartesian and polar coordinates, given by

$$x = r\cos(\theta), \quad y = r\sin(\theta) \tag{8.1}$$

These equations demonstrate how to find the Cartesian coordinates when we are given the polar
coordinates of a point. They can also be used to find the polar coordinates when we know $(x, y)$. A simpler
way to do this is to use the following equations:

$$r = \sqrt{x^2 + y^2}, \quad \tan(\theta) = \frac{y}{x} \tag{8.2}$$
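
Equations 8.1 and 8.2 can be tried out numerically. The following is a minimal sketch, not part of the original text; the function names are illustrative. It assumes the convention $r \geq 0$ and $\theta \in [0, 2\pi)$, and it uses `math.atan2` in place of solving $\tan(\theta) = y/x$ directly, because the tangent alone does not tell us which quadrant the point lies in.

```python
import math

def polar_to_cartesian(r, theta):
    """Equation 8.1: x = r cos(theta), y = r sin(theta)."""
    return (r * math.cos(theta), r * math.sin(theta))

def cartesian_to_polar(x, y):
    """Equation 8.2, returning r >= 0 and theta in [0, 2*pi)."""
    r = math.hypot(x, y)                      # sqrt(x^2 + y^2)
    theta = math.atan2(y, x) % (2 * math.pi)  # quadrant-aware angle
    return (r, theta)

x, y = polar_to_cartesian(5, math.pi / 6)
r, theta = cartesian_to_polar(3, 4)
print(x, y)
print(r, theta)
```

The first call corresponds to the polar point $(5, \pi/6)$ and the second to the Cartesian point $(3, 4)$, both of which are worked by hand in the examples of this section.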


In  the  next  example,  we  look  at  how  to  find  the  Cartesian  coordinates  of  a  point  specified  by  polar 
coordinates. 


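Solution. The point is specified by the polar coordinates $(5, \pi/6)$. Therefore $r = 5$ and $\theta = \pi/6$. From 8.1,

$$x = r\cos(\theta) = 5\cos\left(\frac{\pi}{6}\right) = \frac{5}{2}\sqrt{3}$$

$$y = r\sin(\theta) = 5\sin\left(\frac{\pi}{6}\right) = \frac{5}{2}$$

Thus the Cartesian coordinates are $\left(\frac{5}{2}\sqrt{3}, \frac{5}{2}\right)$. The point is shown in the graph below.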


Consider  the  following  example  of  the  case  where  r  <  0. 

Example 8.2: Finding Cartesian Coordinates


The polar coordinates of a point in the plane are $(-5, \pi/6)$. Find the Cartesian coordinates.


Solution. For the point specified by the polar coordinates $(-5, \pi/6)$, $r = -5$ and $\theta = \pi/6$. From 8.1,

$$x = r\cos(\theta) = -5\cos\left(\frac{\pi}{6}\right) = -\frac{5}{2}\sqrt{3}$$

$$y = r\sin(\theta) = -5\sin\left(\frac{\pi}{6}\right) = -\frac{5}{2}$$

Thus the Cartesian coordinates are $\left(-\frac{5}{2}\sqrt{3}, -\frac{5}{2}\right)$. The point is shown in the following graph.


Recall from the previous example that for the point specified by $(5, \pi/6)$, the Cartesian coordinates
are $\left(\frac{5}{2}\sqrt{3}, \frac{5}{2}\right)$. Notice that in this example, by multiplying r by $-1$, the resulting Cartesian coordinates are
also multiplied by $-1$.

The  following  picture  exhibits  both  points  in  the  above  two  examples  to  emphasize  how  they  are  just 
on  opposite  sides  of  (0,0)  but  at  the  same  distance  from  (0,0). 


In  the  next  two  examples,  we  look  at  how  to  convert  Cartesian  coordinates  to  polar  coordinates. 


Example  8.3:  Finding  Polar  Coordinates 


Suppose  the  Cartesian  coordinates  of  a  point  are  (3,4).  Find  a  pair  of  polar  coordinates  which 
correspond  to  this  point. 

Solution. Using equation 8.2, we can find r and $\theta$. Hence $r = \sqrt{3^2 + 4^2} = 5$. It remains to identify the
angle $\theta$ between the positive x axis and the line from the origin to the point. Since both the x and y values
are positive, the point is in the first quadrant. Therefore, $\theta$ is between 0 and $\pi/2$. Using this and 8.2, we
have to solve:

$$\tan(\theta) = \frac{4}{3}$$

Alternatively, we can use equation 8.1 as follows:

$$3 = 5\cos(\theta)$$

$$4 = 5\sin(\theta)$$

Solving these equations, we find that, approximately, $\theta \approx 0.927295$ radians.

Consider  the  following  example. 


Example  8.4:  Finding  Polar  Coordinates 


Suppose the Cartesian coordinates of a point are $(-\sqrt{3}, 1)$. Find the polar coordinates which
correspond to this point.


Solution. Given the point $(-\sqrt{3}, 1)$,

$$r = \sqrt{1^2 + (-\sqrt{3})^2} = \sqrt{1 + 3} = 2$$

In this case, the point is in the second quadrant since the x value is negative and the y value is positive.
Therefore, $\theta$ will be between $\pi/2$ and $\pi$. Solving the equations

$$-\sqrt{3} = 2\cos(\theta)$$

$$1 = 2\sin(\theta)$$

we find that $\theta = 5\pi/6$. Hence the polar coordinates for this point are $(2, 5\pi/6)$.


Consider this example again. Suppose we used $r = -2$ and $\theta = 2\pi - (\pi/6) = 11\pi/6$. These coordinates
specify the same point as above. Observe that there are infinitely many ways to identify this particular
point with polar coordinates. In fact, every point can be represented with polar coordinates in infinitely
many ways. Because of this, it will usually be the case that $\theta$ is confined to lie in some interval of length
$2\pi$ and $r > 0$, for real numbers r and $\theta$.
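
The claim that different polar pairs can name the same point is easy to check numerically. The sketch below is illustrative only (the helper names are assumptions); it compares the Cartesian images of $(2, 5\pi/6)$, $(-2, 11\pi/6)$, and $(2, 5\pi/6 + 2\pi)$ using equation 8.1.

```python
import math

def polar_to_cartesian(r, theta):
    """Equation 8.1."""
    return (r * math.cos(theta), r * math.sin(theta))

def same_point(p, q, tol=1e-12):
    """True if two polar pairs give the same Cartesian point, up to rounding."""
    (x1, y1), (x2, y2) = polar_to_cartesian(*p), polar_to_cartesian(*q)
    return math.isclose(x1, x2, abs_tol=tol) and math.isclose(y1, y2, abs_tol=tol)

a = (2, 5 * math.pi / 6)
b = (-2, 11 * math.pi / 6)               # negative r, angle shifted by pi
c = (2, 5 * math.pi / 6 + 2 * math.pi)   # same r, angle shifted by 2*pi

print(same_point(a, b), same_point(a, c))
```

Adding any integer multiple of $2\pi$ to the angle, or negating r while shifting the angle by $\pi$, produces another valid pair for the same point.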

Just as with Cartesian coordinates, it is possible to use relations between the polar coordinates to
specify points in the plane. The process of sketching the graphs of these relations is very similar to that
used to sketch graphs of functions in Cartesian coordinates. Consider a relation between polar coordinates
of the form $r = f(\theta)$. To graph such a relation, first make a table of the form

$$\begin{array}{c|c} \theta & r \\ \hline \theta_1 & f(\theta_1) \\ \theta_2 & f(\theta_2) \end{array}$$

Graph the resulting points and connect them with a curve. The following picture illustrates how to begin
this process.


To find the point in the plane corresponding to the ordered pair $(f(\theta), \theta)$, we follow the same process
as when finding the point corresponding to $(r, \theta)$.

Consider  the  following  example  of  this  procedure,  incorporating  computer  software. 


Solution. We will use the computer software Maple to complete this example. The command which
produces the polar graph of the above equation is: > plot(1+cos(t), t=0..2*Pi, coords=polar). Here we use
t to represent the variable $\theta$ for convenience. The command tells Maple that r is given by $1 + \cos(t)$ and
that $t \in [0, 2\pi]$.
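
For readers without Maple, the table-of-points procedure can be sketched in Python. This is an illustrative assumption rather than part of the original example: the helper `polar_table` is hypothetical, and it only tabulates points for $r = 1 + \cos(t)$ instead of drawing them.

```python
import math

def polar_table(f, start, stop, steps):
    """Return rows (theta, r, x, y) for r = f(theta) at evenly spaced angles."""
    rows = []
    for k in range(steps + 1):
        theta = start + (stop - start) * k / steps
        r = f(theta)
        rows.append((theta, r, r * math.cos(theta), r * math.sin(theta)))
    return rows

# r = 1 + cos(t) for t in [0, 2*pi], sampled every pi/6
cardioid = polar_table(lambda t: 1 + math.cos(t), 0.0, 2 * math.pi, 12)
for theta, r, x, y in cardioid:
    print(f"theta={theta:6.3f}  r={r:6.3f}  (x, y)=({x:6.3f}, {y:6.3f})")
```

Plotting the $(x, y)$ columns with any graphing tool and joining consecutive points gives an approximation of the polar graph; using more steps gives a smoother curve.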


The above graph makes sense when considered in terms of trigonometric functions. Suppose $\theta = 0$,
so that $r = 2$, and let $\theta$ increase to $\pi/2$. As $\theta$ increases, $\cos\theta$ decreases to 0. Thus the line from the origin to the
point on the curve should get shorter as $\theta$ goes from 0 to $\pi/2$. As $\theta$ goes from $\pi/2$ to $\pi$, $\cos\theta$ decreases,
eventually equaling $-1$ at $\theta = \pi$. Thus $r = 0$ at this point. This scenario is depicted in the above graph,
which shows a curve called a cardioid.

The following picture illustrates the above procedure for obtaining the polar graph of $r = 1 + \cos(\theta)$.
In this picture, the concentric circles correspond to values of r while the rays from the origin correspond
to the angles which are shown on the picture. The dot on the ray corresponding to the angle $\pi/6$ is located
at a distance of $r = 1 + \cos(\pi/6)$ from the origin. The dot on the ray corresponding to the angle $\pi/3$ is
located at a distance of $r = 1 + \cos(\pi/3)$ from the origin and so forth. The polar graph is obtained by
connecting such points with a smooth curve, with the result being the figure shown above.


Consider  another  example  of  constructing  a  polar  graph. 


Solution. The graph of the polar equation $r = 1 + 2\cos\theta$ for $\theta \in [0, 2\pi]$ is given as follows.


To  see  the  way  this  is  graphed,  consider  the  following  picture.  First  the  indicated  points  were  graphed 
and  then  the  curve  was  drawn  to  connect  the  points.  When  done  by  a  computer,  many  more  points  are 
used  to  create  a  more  accurate  picture. 

Consider  first  the  following  table  of  points. 


$$\begin{array}{c|cccccccc} \theta & \pi/6 & \pi/3 & \pi/2 & 5\pi/6 & \pi & 4\pi/3 & 7\pi/6 & 5\pi/3 \\ \hline r & \sqrt{3}+1 & 2 & 1 & 1-\sqrt{3} & -1 & 0 & 1-\sqrt{3} & 2 \end{array}$$

Note how some entries in the table have $r < 0$. To graph these points, simply move in the opposite
direction. These types of points are responsible for the small loop on the inside of the larger loop in the
graph.
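
The rule "move in the opposite direction" for $r < 0$ can be checked with a short computation. The sketch below is illustrative (the function names are assumptions): for $r = 1 + 2\cos\theta$ at $\theta = \pi$ it compares the point given directly by equation 8.1 with the point at distance $|r|$ along the opposite ray, $\theta + \pi$.

```python
import math

def limacon_r(theta):
    """r = 1 + 2 cos(theta), the relation tabulated above."""
    return 1 + 2 * math.cos(theta)

def to_cartesian(r, theta):
    """Equation 8.1, valid for negative r as well."""
    return (r * math.cos(theta), r * math.sin(theta))

theta = math.pi
r = limacon_r(theta)                               # negative at theta = pi
direct = to_cartesian(r, theta)                    # apply 8.1 with r < 0
flipped = to_cartesian(abs(r), theta + math.pi)    # |r| along the opposite ray

print(r, direct, flipped)
```

Both computations land on the same point, which lies on the positive x axis even though the angle $\pi$ points along the negative x axis.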




The  process  of  constructing  these  graphs  can  be  greatly  facilitated  by  computer  software.  However, 
the  use  of  such  software  should  not  replace  understanding  the  steps  involved. 


The next example shows the graph for the equation $r = 3 + \sin\left(\frac{7\theta}{6}\right)$. For complicated polar graphs,
computer software is used to facilitate the process.


Example 8.7: A Polar Graph

Graph $r = 3 + \sin\left(\frac{7\theta}{6}\right)$ for $\theta \in [0, 14\pi]$.

Solution.


The  next  example  shows  another  situation  in  which  r  can  be  negative. 

Solution.


We  conclude  this  section  with  an  interesting  graph  of  a  simple  polar  equation. 


Solution. The graph of this polar equation is a spiral. This is the case because as $\theta$ increases, so does r.


In  the  next  section,  we  will  look  at  two  ways  of  generalizing  polar  coordinates  to  three  dimensions. 

Exercises


Exercise 8.1.1 In the following, polar coordinates $(r, \theta)$ for a point in the plane are given. Find the
corresponding Cartesian coordinates.

(a) $(2, \pi/4)$

(b) $(-2, \pi/4)$

(c) $(3, \pi/3)$

(d) $(-3, \pi/3)$

(e) $(2, 5\pi/6)$

(f) $(-2, 11\pi/6)$

(g) $(2, \pi/2)$

(h) $(1, 3\pi/2)$

(i) $(-3, 3\pi/4)$

(j) $(3, 5\pi/4)$

(k) $(-2, \pi/6)$

Exercise 8.1.2 Consider the following Cartesian coordinates $(x, y)$. Find polar coordinates corresponding
to these points.

(a) $(-1, 1)$

(b) $(\sqrt{3}, -1)$

(c) $(0, 2)$

(d) $(-5, 0)$

(e) $(-2\sqrt{3}, 2)$

(f) $(2, -2)$

(g) $(-1, \sqrt{3})$

(h) $(-1, -\sqrt{3})$

Exercise 8.1.3 The following relations are written in terms of Cartesian coordinates $(x, y)$. Rewrite them
in terms of polar coordinates, $(r, \theta)$.

(a) $y = x^2$

(b) $y = 2x + 6$

(c) $x^2 + y^2 = 4$

(d) $x^2 - y^2 = 1$

Exercise 8.1.4 Use a calculator or computer algebra system to graph the following polar relations.

(a) $r = 1 - \sin(2\theta)$, $\theta \in [0, 2\pi]$

(b) $r = \sin(4\theta)$, $\theta \in [0, 2\pi]$

(c) $r = \cos(3\theta) + \sin(2\theta)$, $\theta \in [0, 2\pi]$

(d) $r = \theta$, $\theta \in [0, 15]$


Exercise 8.1.5 Graph the polar equation $r = 1 + \sin\theta$ for $\theta \in [0, 2\pi]$.

Exercise 8.1.6 Graph the polar equation $r = 2 + \sin\theta$ for $\theta \in [0, 2\pi]$.

Exercise 8.1.7 Graph the polar equation $r = 1 + 2\sin\theta$ for $\theta \in [0, 2\pi]$.

Exercise 8.1.8 Graph the polar equation $r = 2 + \sin(2\theta)$ for $\theta \in [0, 2\pi]$.

Exercise 8.1.9 Graph the polar equation $r = 1 + \sin(2\theta)$ for $\theta \in [0, 2\pi]$.

Exercise 8.1.10 Graph the polar equation $r = 1 + \sin(3\theta)$ for $\theta \in [0, 2\pi]$.

Exercise 8.1.11 Describe how to solve for r and $\theta$ in terms of x and y in polar coordinates.

Exercise 8.1.12 This problem deals with parabolas, ellipses, and hyperbolas and their equations. Let
$l, e > 0$ and consider

$$r = \frac{l}{1 \pm e\cos\theta}$$

Show that if $e = 0$, the graph of this equation gives a circle. Show that if $0 < e < 1$, the graph is an ellipse,
if $e = 1$ it is a parabola and if $e > 1$, it is a hyperbola.




8.2  Spherical  and  Cylindrical  Coordinates 


Outcomes 


A.  Understand  cylindrical  and  spherical  coordinates. 

B.  Convert  points  between  Cartesian,  cylindrical,  and  spherical  coordinates. 


Spherical and cylindrical coordinates are two generalizations of polar coordinates to three dimensions.
We will first look at cylindrical coordinates.

When moving from polar coordinates in two dimensions to cylindrical coordinates in three dimensions,
we use the polar coordinates in the xy plane and add a z coordinate. For this reason, we use the notation
$(r, \theta, z)$ to express cylindrical coordinates. The relationship between Cartesian coordinates $(x, y, z)$ and
cylindrical coordinates $(r, \theta, z)$ is given by

$$x = r\cos(\theta)$$
$$y = r\sin(\theta)$$
$$z = z$$

where $r \geq 0$, $\theta \in [0, 2\pi)$, and z is simply the Cartesian coordinate. Notice that x and y are given by the
usual polar coordinate formulas in the xy-plane. Recall that r is defined as the length of the ray from the origin to
the point $(x, y, 0)$, while $\theta$ is the angle between the positive x-axis and this same ray.
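
The conversion between Cartesian and cylindrical coordinates can be sketched in a few lines. This is an illustrative sketch, not part of the original text: the function names are assumptions, and `math.atan2` is used to pick the angle in the correct quadrant before reducing it to $[0, 2\pi)$.

```python
import math

def cylindrical_to_cartesian(r, theta, z):
    """x = r cos(theta), y = r sin(theta), z unchanged."""
    return (r * math.cos(theta), r * math.sin(theta), z)

def cartesian_to_cylindrical(x, y, z):
    """r is the distance to the z-axis; theta is reduced to [0, 2*pi)."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)
    return (r, theta, z)

r, theta, z = cartesian_to_cylindrical(0.0, 2.0, 5.0)
back = cylindrical_to_cartesian(3.0, 2.0, -1.0)
again = cartesian_to_cylindrical(*back)
print((r, theta, z), again)
```

For points on the z-axis the angle is not determined; this sketch simply returns $\theta = 0$ there, which is one arbitrary choice among infinitely many.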

To  illustrate  this  coordinate  system,  consider  the  following  two  pictures.  In  the  first  of  these,  both  r 
and  z  are  known.  The  cylinder  corresponds  to  a  given  value  for  r.  A  useful  way  to  think  of  r  is  as  the 
distance  between  a  point  in  three  dimensions  and  the  z-axis.  Every  point  on  the  cylinder  shown  is  at  the 
same  distance  from  the  z-axis.  Giving  a  value  for  z  results  in  a  horizontal  circle,  or  cross  section  of  the 
cylinder  at  the  given  height  on  the  z  axis  (shown  below  as  a  black  line  on  the  cylinder).  In  the  second 
picture,  the  point  is  specified  completely  by  also  knowing  0  as  shown. 


[Figures: left, r and z are known; right, r, $\theta$ and z are known.]


Every point of three dimensional space other than the z axis has unique cylindrical coordinates. Of
course there are infinitely many cylindrical coordinates for the origin and for the z-axis. Any $\theta$ will work
if $r = 0$ and z is given.

Consider now spherical coordinates, the second generalization of polar form in three dimensions. For
a point $(x, y, z)$ in three dimensional space, the spherical coordinates are defined as follows.

$\rho$ : the length of the ray from the origin to the point

$\theta$ : the angle between the positive x-axis and the ray from the origin to the point $(x, y, 0)$

$\phi$ : the angle between the positive z-axis and the ray from the origin to the point of interest

The spherical coordinates are determined by $(\rho, \phi, \theta)$. The relation between these and the Cartesian
coordinates $(x, y, z)$ for a point is as follows.

$$x = \rho\sin(\phi)\cos(\theta), \quad \phi \in [0, \pi]$$
$$y = \rho\sin(\phi)\sin(\theta), \quad \theta \in [0, 2\pi)$$
$$z = \rho\cos\phi, \quad \rho \geq 0.$$
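
The spherical coordinate formulas can be sketched as a pair of conversion functions. This is an illustrative sketch under the conventions stated above ($\rho \geq 0$, $\phi \in [0, \pi]$ measured from the positive z-axis, $\theta \in [0, 2\pi)$); the function names are assumptions, and the origin is arbitrarily assigned $\phi = \theta = 0$.

```python
import math

def spherical_to_cartesian(rho, phi, theta):
    """x = rho sin(phi) cos(theta), y = rho sin(phi) sin(theta), z = rho cos(phi)."""
    x = rho * math.sin(phi) * math.cos(theta)
    y = rho * math.sin(phi) * math.sin(theta)
    z = rho * math.cos(phi)
    return (x, y, z)

def cartesian_to_spherical(x, y, z):
    """Return (rho, phi, theta) with phi in [0, pi] and theta in [0, 2*pi)."""
    rho = math.sqrt(x * x + y * y + z * z)
    if rho == 0.0:
        return (0.0, 0.0, 0.0)  # angles are not determined at the origin
    phi = math.acos(max(-1.0, min(1.0, z / rho)))  # clamp against rounding
    theta = math.atan2(y, x) % (2 * math.pi)
    return (rho, phi, theta)

point = spherical_to_cartesian(2.0, math.pi / 3, math.pi / 4)
rho, phi, theta = cartesian_to_spherical(*point)
print(point, (rho, phi, theta))
```

Converting to Cartesian coordinates and back recovers the original triple whenever it respects the stated ranges and the point is off the z-axis.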

Consider the pictures below. The first illustrates the surface when $\rho$ is known, which is a sphere of
radius $\rho$. The second picture corresponds to knowing both $\rho$ and $\phi$, which results in a circle about the
z-axis. Suppose the first picture demonstrates a graph of the Earth. Then the circle in the second picture
would correspond to a particular latitude.


[Figures: left, $\rho$ is known; right, $\rho$ and $\phi$ are known.]


Giving the third coordinate, $\theta$, completely specifies the point of interest. This is demonstrated in the
following picture. If the latitude corresponds to $\phi$, then we can think of $\theta$ as the longitude.


[Figure: $\rho$, $\phi$ and $\theta$ are known.]

The following picture summarizes the geometric meaning of the three coordinate systems.


Therefore, we can represent the same point in three ways, using Cartesian coordinates, $(x, y, z)$,
cylindrical coordinates, $(r, \theta, z)$, and spherical coordinates $(\rho, \phi, \theta)$.

Using this picture to review, call the point of interest P for convenience. The Cartesian coordinates for
P are $(x, y, z)$. Then $\rho$ is the distance between the origin and the point P. The angle between the positive
z axis and the line between the origin and P is denoted by $\phi$. Then $\theta$ is the angle between the positive
x axis and the line joining the origin to the point $(x, y, 0)$ as shown. This gives the spherical coordinates,
$(\rho, \phi, \theta)$. Given the line from the origin to $(x, y, 0)$, $r = \rho\sin(\phi)$ is the length of this line. Thus r and
$\theta$ determine a point in the xy-plane. In other words, r and $\theta$ are the usual polar coordinates and $r \geq 0$
and $\theta \in [0, 2\pi)$. Letting z denote the usual z coordinate of a point in three dimensions, $(r, \theta, z)$ are the
cylindrical coordinates of P.

The relation between spherical and cylindrical coordinates is that $r = \rho\sin(\phi)$ and the $\theta$ is the same
as the $\theta$ of cylindrical and polar coordinates.
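
The relation between the two systems can be written as a direct conversion from spherical to cylindrical coordinates. The sketch below is illustrative (the function name is an assumption); it uses $r = \rho\sin(\phi)$, keeps $\theta$ unchanged, and takes $z = \rho\cos(\phi)$ from the spherical formulas.

```python
import math

def spherical_to_cylindrical(rho, phi, theta):
    """r = rho sin(phi), theta unchanged, z = rho cos(phi)."""
    return (rho * math.sin(phi), theta, rho * math.cos(phi))

r, theta, z = spherical_to_cylindrical(4.0, math.pi / 6, math.pi / 3)
print(r, theta, z)

# r and z are the legs of a right triangle with hypotenuse rho
print(math.hypot(r, z))
```

The last line reflects the geometry of the summary picture: the distance $\rho$ from the origin, the distance r from the z-axis, and the height z satisfy $r^2 + z^2 = \rho^2$.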

We  will  now  consider  some  examples. 


Example  8.10:  Describing  a  Surface  in  Spherical  Coordinates 


Express the surface $z = \frac{1}{\sqrt{3}}\sqrt{x^2 + y^2}$ in spherical coordinates.


Solution. We will use the equations from above:

$$x = \rho\sin(\phi)\cos(\theta), \quad \phi \in [0, \pi]$$
$$y = \rho\sin(\phi)\sin(\theta), \quad \theta \in [0, 2\pi)$$
$$z = \rho\cos\phi, \quad \rho \geq 0$$

To express the surface in spherical coordinates, we substitute these expressions into the equation. This
is done as follows:

$$\rho\cos(\phi) = \frac{1}{\sqrt{3}}\sqrt{(\rho\sin(\phi)\cos(\theta))^2 + (\rho\sin(\phi)\sin(\theta))^2} = \frac{1}{3}\sqrt{3}\,\rho\sin(\phi).$$

This reduces to

$$\tan(\phi) = \sqrt{3}$$

and so $\phi = \pi/3$.
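
The conclusion $\phi = \pi/3$ can be sanity-checked numerically. The sketch below is an illustration, not part of the original solution: it samples a few points with $\phi = \pi/3$ and arbitrary $\rho$ and $\theta$, converts them to Cartesian coordinates, and measures how far each is from satisfying $z = \frac{1}{\sqrt{3}}\sqrt{x^2 + y^2}$.

```python
import math

phi = math.pi / 3
max_residual = 0.0
for rho in (0.5, 1.0, 3.0):
    for theta in (0.0, 1.0, 4.0):
        # Spherical to Cartesian conversion
        x = rho * math.sin(phi) * math.cos(theta)
        y = rho * math.sin(phi) * math.sin(theta)
        z = rho * math.cos(phi)
        # Distance from the surface equation z = sqrt(x^2 + y^2) / sqrt(3)
        residual = abs(z - math.hypot(x, y) / math.sqrt(3))
        max_residual = max(max_residual, residual)

print(max_residual)
```

A residual at the level of floating point rounding for every sample is consistent with the surface being the cone $\phi = \pi/3$; it is a check, not a proof.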


Example 8.11: Describing a Surface in Spherical Coordinates


Express the surface $y = x$ in terms of spherical coordinates.


Solution. Using the same procedure as the previous example, this says $\rho\sin(\phi)\sin(\theta) = \rho\sin(\phi)\cos(\theta)$.
Simplifying, $\sin(\theta) = \cos(\theta)$, which you could also write $\tan(\theta) = 1$.

We  conclude  this  section  with  an  example  of  how  to  describe  a  surface  using  cylindrical  coordinates. 


Example 8.12: Describing a Surface in Cylindrical Coordinates


Express the surface $x^2 + y^2 = 4$ in cylindrical coordinates.


Solution. Recall that to convert from Cartesian to cylindrical coordinates, we can use the following
equations:

$$x = r\cos(\theta), \quad y = r\sin(\theta), \quad z = z$$

Substituting these equations in for x, y, z in the equation for the surface, we have

$$r^2\cos^2(\theta) + r^2\sin^2(\theta) = 4$$

This can be written as $r^2(\cos^2(\theta) + \sin^2(\theta)) = 4$. Recall that $\cos^2(\theta) + \sin^2(\theta) = 1$. Thus $r^2 = 4$ or
$r = 2$.

Exercises 


Exercise 8.2.1 The following are the cylindrical coordinates of points, $(r, \theta, z)$. Find the Cartesian and
spherical coordinates of each point.

(a)  (5,f,-3) 

(b)  (3,f,4) 

(c)  (4,¥,1) 

(d)  (2,f,-2) 

(e)  (3,f,-l) 

(f)  (8,^,-ll) 

Exercise 8.2.2 The following are the Cartesian coordinates of points, $(x, y, z)$. Find the cylindrical and
spherical coordinates of these points.

(a)  (§V2,§V2,^3) 

(b)  (|,|V3,2) 

fcj  (-§V2,§V2,ll) 

(d)  (-|,|x/3,23) 

W  ( — x/3 ,  —  1 ,  —  5 ) 
f/j 

(g)  (y/2,y/6,2y/2) 

(h)  (4V3,§,1) 

(i)  (-J^,|^,-|V3) 

(j)  (~V3,l,2V3) 

(k) 

Exercise 8.2.3 The following are spherical coordinates of points in the form $(\rho, \phi, \theta)$. Find the Cartesian
and cylindrical coordinates of each point.

(“)  (44>f) 

(b) 

W  (4,f4) 

(/) 

Exercise 8.2.4 Describe the surface $\phi = \pi/4$ in Cartesian coordinates, where $\phi$ is the polar angle in
spherical coordinates.


Exercise 8.2.5 Describe the surface $\theta = \pi/4$ in spherical coordinates, where $\theta$ is the angle measured
from the positive x axis.

Exercise 8.2.6 Describe the surface $r = 5$ in Cartesian coordinates, where r is one of the cylindrical
coordinates.

Exercise 8.2.7 Describe the surface $\rho = 4$ in Cartesian coordinates, where $\rho$ is the distance to the origin.

Exercise 8.2.8 Give the cone described by $z = \sqrt{x^2 + y^2}$ in cylindrical coordinates and in spherical
coordinates.

Exercise 8.2.9 The following are described in Cartesian coordinates. Rewrite them in terms of spherical
coordinates.

(a) $z = x^2 + y^2$.

(b) $x^2 - y^2 = 1$.

(c) $z^2 + x^2 + y^2 = 6$.

(d) $z = \sqrt{x^2 + y^2}$.

(e) $y = x$.

(f) $z = x$.

Exercise 8.2.10 The following are described in Cartesian coordinates. Rewrite them in terms of
cylindrical coordinates.

(a) $z = x^2 + y^2$.

(b) $x^2 - y^2 = 1$.

(c) $z^2 + x^2 + y^2 = 6$.

(d) $z = \sqrt{x^2 + y^2}$.

(e) $y = x$.

(f) $z = x$.

9.1 Algebraic Considerations


Outcomes 


A.  Develop  the  abstract  concept  of  a  vector  space  through  axioms. 

B.  Deduce  basic  properties  of  vector  spaces. 

C.  Use  the  vector  space  axioms  to  determine  if  a  set  and  its  operations  constitute  a  vector  space. 


In  this  section  we  consider  the  idea  of  an  abstract  vector  space.  A  vector  space  is  something  which  has 
two  operations  satisfying  the  following  vector  space  axioms. 


Definition  9.1:  Vector  Space 


A  vector  space  V  is  a  set  of  vectors  with  two  operations  defined,  addition  and  scalar  multiplication, 
which  satisfy  the  axioms  of  addition  and  scalar  multiplication. 


In the following definition we define two operations: vector addition, denoted by +, and scalar
multiplication, denoted by placing the scalar next to the vector. A vector space need not have the usual operations, and
for this reason the operations will always be given in the definition of the vector space. The axioms below
for addition (written +) and scalar multiplication must hold however addition and scalar multiplication
are defined for the vector space.

It is important to note that we have seen much of this content before, in terms of $\mathbb{R}^n$. We will prove in
this section that $\mathbb{R}^n$ is an example of a vector space and therefore all discussions in this chapter will pertain
to $\mathbb{R}^n$. While it may be useful to consider all concepts of this chapter in terms of $\mathbb{R}^n$, it is also important to
understand that these concepts apply to all vector spaces.

In the following definition, we will choose scalars a, b to be real numbers and are thus dealing with
real vector spaces. However, we could also choose scalars which are complex numbers. In this case, we
would call the vector space V complex.

Definition 9.2: Axioms of Addition


Let $\vec{v}, \vec{w}, \vec{z}$ be vectors in a vector space V. Then they satisfy the following axioms of addition:

• Closed under Addition

If $\vec{v}, \vec{w}$ are in V, then $\vec{v} + \vec{w}$ is also in V.

• The Commutative Law of Addition

$\vec{v} + \vec{w} = \vec{w} + \vec{v}$

• The Associative Law of Addition

$(\vec{v} + \vec{w}) + \vec{z} = \vec{v} + (\vec{w} + \vec{z})$

• The Existence of an Additive Identity

$\vec{v} + \vec{0} = \vec{v}$

• The Existence of an Additive Inverse

$\vec{v} + (-\vec{v}) = \vec{0}$


Definition 9.3: Axioms of Scalar Multiplication


Let $a, b \in \mathbb{R}$ and let $\vec{v}, \vec{w}, \vec{z}$ be vectors in a vector space V. Then they satisfy the following axioms of
scalar multiplication:

• Closed under Scalar Multiplication

If a is a real number, and $\vec{v}$ is in V, then $a\vec{v}$ is in V.

• $a(\vec{v} + \vec{w}) = a\vec{v} + a\vec{w}$

• $(a + b)\vec{v} = a\vec{v} + b\vec{v}$

• $a(b\vec{v}) = (ab)\vec{v}$

• $1\vec{v} = \vec{v}$


Consider the following example, in which we prove that $\mathbb{R}^n$ is in fact a vector space.


Example 9.4: $\mathbb{R}^n$


$\mathbb{R}^n$, under the usual operations of vector addition and scalar multiplication, is a vector space.


Solution. To show that $\mathbb{R}^n$ is a vector space, we need to show that the above axioms hold. Let $\vec{x}, \vec{y}, \vec{z}$ be
vectors in $\mathbb{R}^n$. We first prove the axioms for vector addition.


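• To show that $\mathbb{R}^n$ is closed under addition, we must show that for two vectors in $\mathbb{R}^n$ their sum is also
in $\mathbb{R}^n$. The sum $\vec{x} + \vec{y}$ is given by:

$$\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}$$

The sum is a vector with n entries, showing that it is in $\mathbb{R}^n$. Hence $\mathbb{R}^n$ is closed under vector addition.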
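• To show that addition is commutative, consider the following:

$$\begin{aligned}
\vec{x} + \vec{y} &= \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \\
&= \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix} \\
&= \begin{bmatrix} y_1 + x_1 \\ y_2 + x_2 \\ \vdots \\ y_n + x_n \end{bmatrix} \\
&= \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} + \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \\
&= \vec{y} + \vec{x}
\end{aligned}$$

Hence addition of vectors in $\mathbb{R}^n$ is commutative.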

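• We will show that addition of vectors in $\mathbb{R}^n$ is associative in a similar way.

$$\begin{aligned}
(\vec{x} + \vec{y}) + \vec{z} &= \left( \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \right) + \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} \\
&= \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix} + \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} \\
&= \begin{bmatrix} (x_1 + y_1) + z_1 \\ (x_2 + y_2) + z_2 \\ \vdots \\ (x_n + y_n) + z_n \end{bmatrix} \\
&= \begin{bmatrix} x_1 + (y_1 + z_1) \\ x_2 + (y_2 + z_2) \\ \vdots \\ x_n + (y_n + z_n) \end{bmatrix} \\
&= \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} y_1 + z_1 \\ y_2 + z_2 \\ \vdots \\ y_n + z_n \end{bmatrix} \\
&= \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \left( \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} + \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} \right) \\
&= \vec{x} + (\vec{y} + \vec{z})
\end{aligned}$$

Hence addition of vectors is associative.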


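• Next, we show the existence of an additive identity. Let $\vec{0} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$.

$$\begin{aligned}
\vec{x} + \vec{0} &= \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \\
&= \begin{bmatrix} x_1 + 0 \\ x_2 + 0 \\ \vdots \\ x_n + 0 \end{bmatrix} \\
&= \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \\
&= \vec{x}
\end{aligned}$$

Hence the zero vector $\vec{0}$ is an additive identity.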


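• Next, we prove the existence of an additive inverse. Let $-\vec{x} = \begin{bmatrix} -x_1 \\ -x_2 \\ \vdots \\ -x_n \end{bmatrix}$.

$$\begin{aligned}
\vec{x} + (-\vec{x}) &= \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} -x_1 \\ -x_2 \\ \vdots \\ -x_n \end{bmatrix} \\
&= \begin{bmatrix} x_1 - x_1 \\ x_2 - x_2 \\ \vdots \\ x_n - x_n \end{bmatrix} \\
&= \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \\
&= \vec{0}
\end{aligned}$$

Hence $-\vec{x}$ is an additive inverse.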

We now need to prove the axioms related to scalar multiplication. Let a, b be real numbers and let $\vec{x}, \vec{y}$
be vectors in $\mathbb{R}^n$.


•  We first show that $\mathbb{R}^n$ is closed under scalar multiplication. To do so, we show that $a\vec{x}$ is also a vector
with n entries.

\[
a\vec{x} = a \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= \begin{bmatrix} a x_1 \\ a x_2 \\ \vdots \\ a x_n \end{bmatrix}
\]

The vector $a\vec{x}$ is again a vector with n entries, showing that $\mathbb{R}^n$ is closed under scalar multiplication.


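•  We wish to show that $a(\vec{x} + \vec{y}) = a\vec{x} + a\vec{y}$.

\[
\begin{aligned}
a(\vec{x} + \vec{y})
&= a \left( \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
 + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \right)
 = a \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}
 = \begin{bmatrix} a(x_1 + y_1) \\ a(x_2 + y_2) \\ \vdots \\ a(x_n + y_n) \end{bmatrix} \\
&= \begin{bmatrix} a x_1 + a y_1 \\ a x_2 + a y_2 \\ \vdots \\ a x_n + a y_n \end{bmatrix}
 = \begin{bmatrix} a x_1 \\ a x_2 \\ \vdots \\ a x_n \end{bmatrix}
 + \begin{bmatrix} a y_1 \\ a y_2 \\ \vdots \\ a y_n \end{bmatrix}
 = a\vec{x} + a\vec{y}
\end{aligned}
\]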

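•  Next, we wish to show that $(a + b)\vec{x} = a\vec{x} + b\vec{x}$.

\[
\begin{aligned}
(a + b)\vec{x}
&= (a + b) \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
 = \begin{bmatrix} (a + b) x_1 \\ (a + b) x_2 \\ \vdots \\ (a + b) x_n \end{bmatrix}
 = \begin{bmatrix} a x_1 + b x_1 \\ a x_2 + b x_2 \\ \vdots \\ a x_n + b x_n \end{bmatrix} \\
&= \begin{bmatrix} a x_1 \\ a x_2 \\ \vdots \\ a x_n \end{bmatrix}
 + \begin{bmatrix} b x_1 \\ b x_2 \\ \vdots \\ b x_n \end{bmatrix}
 = a\vec{x} + b\vec{x}
\end{aligned}
\]

•  We wish to show that $a(b\vec{x}) = (ab)\vec{x}$.

\[
\begin{aligned}
a(b\vec{x})
&= a \left( b \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \right)
 = a \begin{bmatrix} b x_1 \\ b x_2 \\ \vdots \\ b x_n \end{bmatrix}
 = \begin{bmatrix} a(b x_1) \\ a(b x_2) \\ \vdots \\ a(b x_n) \end{bmatrix} \\
&= \begin{bmatrix} (ab) x_1 \\ (ab) x_2 \\ \vdots \\ (ab) x_n \end{bmatrix}
 = (ab) \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
 = (ab)\vec{x}
\end{aligned}
\]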

•  Finally, we need to show that $1\vec{x} = \vec{x}$.

\[
1\vec{x} = 1 \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= \begin{bmatrix} 1 x_1 \\ 1 x_2 \\ \vdots \\ 1 x_n \end{bmatrix}
= \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= \vec{x}
\]

By the above proofs, it is clear that $\mathbb{R}^n$ satisfies the vector space axioms. Hence, $\mathbb{R}^n$ is a vector space
under the usual operations of vector addition and scalar multiplication. ♠

We now consider some examples of vector spaces.


Example  9.5:  Vector  Space  of  Polynomials 


Let $\mathbb{P}_2$ be the set of all polynomials of at most degree 2 as well as the zero polynomial. Define addition
to be the standard addition of polynomials, and scalar multiplication the usual multiplication
of a polynomial by a number. Then $\mathbb{P}_2$ is a vector space.


Solution. We can write $\mathbb{P}_2$ explicitly as

\[
\mathbb{P}_2 = \left\{ a_2 x^2 + a_1 x + a_0 \mid a_i \in \mathbb{R} \text{ for all } i \right\}
\]

To show that $\mathbb{P}_2$ is a vector space, we verify the axioms. Let p(x), q(x), r(x) be polynomials in $\mathbb{P}_2$ and let
a, b, c be real numbers. Write $p(x) = p_2 x^2 + p_1 x + p_0$, $q(x) = q_2 x^2 + q_1 x + q_0$, and $r(x) = r_2 x^2 + r_1 x + r_0$.


•  We first prove that addition of polynomials in $\mathbb{P}_2$ is closed. For two polynomials in $\mathbb{P}_2$ we need to
show that their sum is also a polynomial in $\mathbb{P}_2$. From the definition of $\mathbb{P}_2$, a polynomial is contained
in $\mathbb{P}_2$ if it is of degree at most 2 or the zero polynomial.

\[
\begin{aligned}
p(x) + q(x) &= p_2 x^2 + p_1 x + p_0 + q_2 x^2 + q_1 x + q_0 \\
&= (p_2 + q_2) x^2 + (p_1 + q_1) x + (p_0 + q_0)
\end{aligned}
\]

The sum is either the zero polynomial or a polynomial of degree at most 2, and therefore is in $\mathbb{P}_2$. It follows that $\mathbb{P}_2$ is closed under
addition.

•  We need to show that addition is commutative, that is p(x) + q(x) = q(x) + p(x).

\[
\begin{aligned}
p(x) + q(x) &= p_2 x^2 + p_1 x + p_0 + q_2 x^2 + q_1 x + q_0 \\
&= (p_2 + q_2) x^2 + (p_1 + q_1) x + (p_0 + q_0) \\
&= (q_2 + p_2) x^2 + (q_1 + p_1) x + (q_0 + p_0) \\
&= q_2 x^2 + q_1 x + q_0 + p_2 x^2 + p_1 x + p_0 \\
&= q(x) + p(x)
\end{aligned}
\]


•  Next, we need to show that addition is associative. That is, that (p(x) + q(x)) + r(x) = p(x) + (q(x) + r(x)).

\[
\begin{aligned}
(p(x) + q(x)) + r(x) &= (p_2 x^2 + p_1 x + p_0 + q_2 x^2 + q_1 x + q_0) + r_2 x^2 + r_1 x + r_0 \\
&= (p_2 + q_2) x^2 + (p_1 + q_1) x + (p_0 + q_0) + r_2 x^2 + r_1 x + r_0 \\
&= (p_2 + q_2 + r_2) x^2 + (p_1 + q_1 + r_1) x + (p_0 + q_0 + r_0) \\
&= p_2 x^2 + p_1 x + p_0 + (q_2 + r_2) x^2 + (q_1 + r_1) x + (q_0 + r_0) \\
&= p_2 x^2 + p_1 x + p_0 + (q_2 x^2 + q_1 x + q_0 + r_2 x^2 + r_1 x + r_0) \\
&= p(x) + (q(x) + r(x))
\end{aligned}
\]


•  Next, we must prove that there exists an additive identity. Let $0(x) = 0x^2 + 0x + 0$.

\[
\begin{aligned}
p(x) + 0(x) &= p_2 x^2 + p_1 x + p_0 + 0x^2 + 0x + 0 \\
&= (p_2 + 0) x^2 + (p_1 + 0) x + (p_0 + 0) \\
&= p_2 x^2 + p_1 x + p_0 \\
&= p(x)
\end{aligned}
\]

Hence an additive identity exists, specifically the zero polynomial.

•  Next we must prove that there exists an additive inverse. Let $-p(x) = -p_2 x^2 - p_1 x - p_0$ and consider
the following:

\[
\begin{aligned}
p(x) + (-p(x)) &= p_2 x^2 + p_1 x + p_0 + (-p_2 x^2 - p_1 x - p_0) \\
&= (p_2 - p_2) x^2 + (p_1 - p_1) x + (p_0 - p_0) \\
&= 0x^2 + 0x + 0 \\
&= 0(x)
\end{aligned}
\]

Hence an additive inverse $-p(x)$ exists such that $p(x) + (-p(x)) = 0(x)$.

We  now  need  to  verify  the  axioms  related  to  scalar  multiplication. 

•  First we prove that $\mathbb{P}_2$ is closed under scalar multiplication. That is, we show that ap(x) is also a
polynomial of degree at most 2.

\[
a p(x) = a \left( p_2 x^2 + p_1 x + p_0 \right) = a p_2 x^2 + a p_1 x + a p_0
\]

Therefore $\mathbb{P}_2$ is closed under scalar multiplication.

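•  We need to show that a(p(x) + q(x)) = ap(x) + aq(x).

\[
\begin{aligned}
a(p(x) + q(x)) &= a \left( p_2 x^2 + p_1 x + p_0 + q_2 x^2 + q_1 x + q_0 \right) \\
&= a \left( (p_2 + q_2) x^2 + (p_1 + q_1) x + (p_0 + q_0) \right) \\
&= a(p_2 + q_2) x^2 + a(p_1 + q_1) x + a(p_0 + q_0) \\
&= (a p_2 + a q_2) x^2 + (a p_1 + a q_1) x + (a p_0 + a q_0) \\
&= a p_2 x^2 + a p_1 x + a p_0 + a q_2 x^2 + a q_1 x + a q_0 \\
&= a p(x) + a q(x)
\end{aligned}
\]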

•  Next we show that (a + b)p(x) = ap(x) + bp(x).

\[
\begin{aligned}
(a + b) p(x) &= (a + b)(p_2 x^2 + p_1 x + p_0) \\
&= (a + b) p_2 x^2 + (a + b) p_1 x + (a + b) p_0 \\
&= a p_2 x^2 + a p_1 x + a p_0 + b p_2 x^2 + b p_1 x + b p_0 \\
&= a p(x) + b p(x)
\end{aligned}
\]

•  The next axiom which needs to be verified is a(bp(x)) = (ab)p(x).

\[
\begin{aligned}
a(b p(x)) &= a \left( b \left( p_2 x^2 + p_1 x + p_0 \right) \right) \\
&= a \left( b p_2 x^2 + b p_1 x + b p_0 \right) \\
&= a b p_2 x^2 + a b p_1 x + a b p_0 \\
&= (ab) \left( p_2 x^2 + p_1 x + p_0 \right) \\
&= (ab) p(x)
\end{aligned}
\]


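•  Finally, we show that 1p(x) = p(x).

\[
\begin{aligned}
1 p(x) &= 1 \left( p_2 x^2 + p_1 x + p_0 \right) \\
&= 1 p_2 x^2 + 1 p_1 x + 1 p_0 \\
&= p_2 x^2 + p_1 x + p_0 \\
&= p(x)
\end{aligned}
\]

Since the above axioms hold, we know that $\mathbb{P}_2$ as described above is a vector space. ♠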
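
The axiom checks above can also be spot-checked numerically. The following is only a minimal sketch, not a proof: it assumes a polynomial $p_2 x^2 + p_1 x + p_0$ is stored as the coefficient tuple `(p2, p1, p0)`, and the helper names `add`, `scale`, and `check_axioms` are illustrative choices rather than anything defined in the text.

```python
from fractions import Fraction
from itertools import product

# A polynomial p2*x^2 + p1*x + p0 is stored as the tuple (p2, p1, p0).
def add(p, q):
    """Standard polynomial addition, coefficient by coefficient."""
    return tuple(pi + qi for pi, qi in zip(p, q))

def scale(a, p):
    """Multiply every coefficient of p by the scalar a."""
    return tuple(a * pi for pi in p)

ZERO = (0, 0, 0)  # the zero polynomial 0x^2 + 0x + 0

def check_axioms(polys, scalars):
    """Return True if every sampled polynomial and scalar satisfies the axioms."""
    for p, q, r in product(polys, repeat=3):
        if add(p, q) != add(q, p):                    # commutativity
            return False
        if add(add(p, q), r) != add(p, add(q, r)):    # associativity
            return False
    for p in polys:
        if add(p, ZERO) != p or add(p, scale(-1, p)) != ZERO:
            return False                              # identity and inverse
        if scale(1, p) != p:
            return False
    for a, b in product(scalars, repeat=2):
        for p, q in product(polys, repeat=2):
            if scale(a, add(p, q)) != add(scale(a, p), scale(a, q)):
                return False
            if scale(a + b, p) != add(scale(a, p), scale(b, p)):
                return False
            if scale(a, scale(b, p)) != scale(a * b, p):
                return False
    return True

samples = [(1, 2, 3), (0, -4, 5), (Fraction(1, 2), 0, -1), ZERO]
print(check_axioms(samples, [Fraction(3, 4), -2, 0]))
```

Exact rational arithmetic (`Fraction`) is used so that equality tests are not disturbed by floating-point rounding. A finite sample can only fail to find a counterexample; the general argument is the one given in the solution.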
Another important example of a vector space is the set of all matrices of the same size.


Example  9.6:  Vector  Space  of  Matrices 


Let $M_{2,3}$ be the set of all 2 × 3 matrices. Using the usual operations of matrix addition and scalar
multiplication, show that $M_{2,3}$ is a vector space.


Solution. Let A, B be 2 × 3 matrices in $M_{2,3}$. We first prove the axioms for addition.

•  In order to prove that $M_{2,3}$ is closed under matrix addition, we show that the sum A + B is in $M_{2,3}$.
This means showing that A + B is a 2 × 3 matrix.

\[
A + B =
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
+ \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix}
= \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} & a_{13} + b_{13} \\ a_{21} + b_{21} & a_{22} + b_{22} & a_{23} + b_{23} \end{bmatrix}
\]

You can see that the sum is a 2 × 3 matrix, so it is in $M_{2,3}$. It follows that $M_{2,3}$ is closed under matrix
addition.

•  The remaining axioms regarding matrix addition follow from properties of matrix addition. Therefore
$M_{2,3}$ satisfies the axioms of matrix addition.

We now turn our attention to the axioms regarding scalar multiplication. Let A, B be matrices in $M_{2,3}$
and let c be a real number.

•  We first show that $M_{2,3}$ is closed under scalar multiplication. That is, we show that cA is a 2 × 3 matrix.

\[
cA = c \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
= \begin{bmatrix} c a_{11} & c a_{12} & c a_{13} \\ c a_{21} & c a_{22} & c a_{23} \end{bmatrix}
\]

This is a 2 × 3 matrix in $M_{2,3}$ which proves that the set is closed under scalar multiplication.

•  The remaining axioms of scalar multiplication follow from properties of scalar multiplication of
matrices. Therefore $M_{2,3}$ satisfies the axioms of scalar multiplication.

In conclusion, $M_{2,3}$ satisfies the required axioms and is a vector space. ♠

While here we proved that the set of all 2 × 3 matrices is a vector space, there is nothing special about
this choice of matrix size. In fact if we instead consider $M_{m,n}$, the set of all m × n matrices, then $M_{m,n}$ is a
vector space under the operations of matrix addition and scalar multiplication.

We  now  examine  an  example  of  a  set  that  does  not  satisfy  all  of  the  above  axioms,  and  is  therefore  not 
a  vector  space. 


Example  9.7:  Not  a  Vector  Space 


Let V denote the set of 2 × 3 matrices. Let addition in V be defined by A + B = A for matrices A, B
in V. Let scalar multiplication in V be the usual scalar multiplication of matrices. Show that V is
not a vector space.


Solution.  In  order  to  show  that  V  is  not  a  vector  space,  it  suffices  to  find  only  one  axiom  which  is  not 
satisfied.  We  will  begin  by  examining  the  axioms  for  addition  until  one  is  found  which  does  not  hold.  Let 
A,B  be  matrices  in  V. 

•  We  first  want  to  check  if  addition  is  closed.  Consider  A+B.  By  the  definition  of  addition  in  the 
example,  we  have  that  A  +  B  =  A.  Since  A  is  a  2  x  3  matrix,  it  follows  that  the  sum  A  +  B  is  in  V, 
and  V  is  closed  under  addition. 

•  We now wish to check if addition is commutative. That is, we want to check if A + B = B + A for
all choices of A and B in V. From the definition of addition, we have that A + B = A and B + A = B.
Therefore, we can find A, B in V such that these sums are not equal. One example is

\[
A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\]

Using the operation defined by A + B = A, we have

\[
A + B = A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad
B + A = B = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\]

It follows that $A + B \neq B + A$. Therefore addition as defined for V is not commutative and V fails
this axiom. Hence V is not a vector space. ♠
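
The counterexample can be reproduced in a few lines. This is a small sketch under the assumption that a matrix is stored as a tuple of rows; the function name `odd_add` is a hypothetical label for the non-standard addition A + B = A used in this example.

```python
# Matrices are stored as tuples of rows; odd_add is an illustrative name.
def odd_add(A, B):
    """The addition from the example: A + B is defined to be A."""
    return A

A = ((1, 0, 0), (0, 0, 0))
B = ((0, 0, 0), (1, 0, 0))

# Commutativity would require odd_add(A, B) == odd_add(B, A).
print(odd_add(A, B) == odd_add(B, A))  # False
```

A single failing pair is enough, since an axiom must hold for every choice of A and B.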


Consider  another  example  of  a  vector  space. 


Example  9.8:  Vector  Space  of  Functions 


Let S be a nonempty set and define $\mathbb{F}_S$ to be the set of real functions defined on S. In other words,
we write $\mathbb{F}_S : S \mapsto \mathbb{R}$. Letting a, b, c be scalars and f, g, h functions, the vector operations are defined
as

\[
\begin{aligned}
(f + g)(x) &= f(x) + g(x) \\
(af)(x) &= a(f(x))
\end{aligned}
\]

Show that $\mathbb{F}_S$ is a vector space.


Solution. To verify that $\mathbb{F}_S$ is a vector space, we must prove the axioms beginning with those for addition.
Let f, g, h be functions in $\mathbb{F}_S$.

•  First we check that addition is closed. For functions f, g defined on the set S, their sum given by

\[
(f + g)(x) = f(x) + g(x)
\]

is again a function defined on S. Hence this sum is in $\mathbb{F}_S$ and $\mathbb{F}_S$ is closed under addition.

•  Secondly, we check the commutative law of addition:

\[
(f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x)
\]

Since x is arbitrary, f + g = g + f.

•  Next we check the associative law of addition:

\[
\begin{aligned}
((f + g) + h)(x) &= (f + g)(x) + h(x) = (f(x) + g(x)) + h(x) \\
&= f(x) + (g(x) + h(x)) = f(x) + (g + h)(x) = (f + (g + h))(x)
\end{aligned}
\]

and so (f + g) + h = f + (g + h).

•  Next we check for an additive identity. Let 0 denote the function which is given by 0(x) = 0. Then
this is an additive identity because

\[
(f + 0)(x) = f(x) + 0(x) = f(x)
\]

and so f + 0 = f.

•  Finally, check for an additive inverse. Let $-f$ be the function which satisfies $(-f)(x) = -f(x)$.
Then

\[
(f + (-f))(x) = f(x) + (-f)(x) = f(x) + (-f(x)) = 0
\]

Hence $f + (-f) = 0$.

Now,  check  the  axioms  for  scalar  multiplication. 

•  We first need to check that $\mathbb{F}_S$ is closed under scalar multiplication. For a function f in $\mathbb{F}_S$ and real
number a, the function (af)(x) = a(f(x)) is again a function defined on the set S. Hence af is
in $\mathbb{F}_S$ and $\mathbb{F}_S$ is closed under scalar multiplication.

•  Next,

\[
((a + b)f)(x) = (a + b)f(x) = af(x) + bf(x) = (af + bf)(x)
\]

and so (a + b)f = af + bf.

•  Similarly,

\[
(a(f + g))(x) = a(f + g)(x) = a(f(x) + g(x)) = af(x) + ag(x) = (af + ag)(x)
\]

and so a(f + g) = af + ag.

•  Also,

\[
((ab)f)(x) = (ab)f(x) = a(bf(x)) = (a(bf))(x)
\]

so (ab)f = a(bf).

•  Finally (1f)(x) = 1f(x) = f(x) so 1f = f.

It follows that $\mathbb{F}_S$ satisfies all the required axioms and is a vector space. ♠
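
The pointwise operations in this example translate directly into code. The sketch below is illustrative only: the sample set `S`, the functions `f` and `g`, and the helper names `f_add`, `f_scale`, and `equal_on` are assumptions introduced here, and two functions are treated as equal when they agree at every point of `S`.

```python
def f_add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def f_scale(a, f):
    """Pointwise scalar multiple: (af)(x) = a * f(x)."""
    return lambda x: a * f(x)

zero = lambda x: 0            # the additive identity 0(x) = 0

S = [-2, -1, 0, 1, 2, 3]      # a small sample set S (an assumption)
f = lambda x: x * x
g = lambda x: 3 * x - 1

def equal_on(S, u, v):
    """Two functions on S are equal when they agree at every point of S."""
    return all(u(x) == v(x) for x in S)

print(equal_on(S, f_add(f, g), f_add(g, f)))                 # commutativity
print(equal_on(S, f_scale(2, f_add(f, g)),
               f_add(f_scale(2, f), f_scale(2, g))))         # a(f+g) = af+ag
print(equal_on(S, f_add(f, f_scale(-1, f)), zero))           # f + (-f) = 0
```

Each check reduces to the corresponding property of real numbers at a single point x, which is exactly how the proof above proceeds.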

Consider  the  following  important  theorem. 


Proof.

1.  When we say that the additive identity, $\vec{0}$, is unique, we mean that if a vector acts like the additive
identity, then it is the additive identity. To prove this uniqueness, we want to show that another
vector which acts like the additive identity is actually equal to $\vec{0}$.

Suppose $\vec{0}'$ is also an additive identity. Then,

\[
\vec{0} + \vec{0}' = \vec{0}
\]

Now, for $\vec{0}$ the additive identity given above in the axioms, we have that

\[
\vec{0}' + \vec{0} = \vec{0}'
\]

So by the commutative property:

\[
\vec{0} = \vec{0} + \vec{0}' = \vec{0}' + \vec{0} = \vec{0}'
\]

This says that if a vector acts like an additive identity (such as $\vec{0}'$), it in fact equals $\vec{0}$. This proves
the uniqueness of $\vec{0}$.

2.  When we say that the additive inverse, $-\vec{x}$, is unique, we mean that if a vector acts like the additive
inverse, then it is the additive inverse. Suppose that $\vec{y}$ acts like an additive inverse:

\[
\vec{x} + \vec{y} = \vec{0}
\]

Then the following holds:

\[
\vec{y} = \vec{0} + \vec{y} = (-\vec{x} + \vec{x}) + \vec{y} = -\vec{x} + (\vec{x} + \vec{y}) = -\vec{x} + \vec{0} = -\vec{x}
\]

Thus if $\vec{y}$ acts like the additive inverse, it is equal to the additive inverse $-\vec{x}$. This proves the uniqueness
of $-\vec{x}$.

3.  This statement claims that for all vectors $\vec{x}$, scalar multiplication by 0 equals the zero vector $\vec{0}$.
Consider the following, using the fact that we can write 0 = 0 + 0:

\[
0\vec{x} = (0 + 0)\vec{x} = 0\vec{x} + 0\vec{x}
\]

We use a small trick here: add $-0\vec{x}$ to both sides. This gives

\[
\begin{aligned}
0\vec{x} + (-0\vec{x}) &= 0\vec{x} + 0\vec{x} + (-0\vec{x}) \\
\vec{0} &= 0\vec{x} + \vec{0} \\
\vec{0} &= 0\vec{x}
\end{aligned}
\]

This proves that scalar multiplication of any vector by 0 results in the zero vector $\vec{0}$.

4.  Finally, we wish to show that scalar multiplication of $-1$ and any vector $\vec{x}$ results in the additive
inverse of that vector, $-\vec{x}$. Recall from 2. above that the additive inverse is unique. Consider the
following:

\[
\begin{aligned}
(-1)\vec{x} + \vec{x} &= (-1)\vec{x} + 1\vec{x} \\
&= (-1 + 1)\vec{x} \\
&= 0\vec{x} \\
&= \vec{0}
\end{aligned}
\]

By the uniqueness of the additive inverse shown earlier, any vector which acts like the additive
inverse must be equal to the additive inverse. It follows that $(-1)\vec{x} = -\vec{x}$. ♠


An  important  use  of  the  additive  inverse  is  the  following  theorem. 


Theorem  9.10: 


Let V be a vector space. Then $\vec{v} + \vec{w} = \vec{v} + \vec{z}$ implies that $\vec{w} = \vec{z}$ for all $\vec{v}, \vec{w}, \vec{z} \in V$.


The proof follows from the vector space axioms, in particular the existence of an additive inverse $(-\vec{v})$.
The proof is left as an exercise to the reader.


Exercises 


Exercise 9.1.1 Suppose you have $\mathbb{R}^2$ and the + operation is as follows:

\[
(a, b) + (c, d) = (a + d, b + c).
\]

Scalar multiplication is defined in the usual way. Is this a vector space? Explain why or why not.

Exercise 9.1.2 Suppose you have $\mathbb{R}^2$ and the + operation is defined as follows.

\[
(a, b) + (c, d) = (0, b + d)
\]

Scalar multiplication is defined in the usual way. Is this a vector space? Explain why or why not.

Exercise 9.1.3 Suppose you have $\mathbb{R}^2$ and scalar multiplication is defined as c(a, b) = (a, cb) while vector
addition is defined as usual. Is this a vector space? Explain why or why not.

Exercise 9.1.4 Suppose you have $\mathbb{R}^2$ and the + operation is defined as follows.

\[
(a, b) + (c, d) = (a - c, b - d)
\]

Scalar multiplication is the same as usual. Is this a vector space? Explain why or why not.

Exercise 9.1.5 Consider all the functions defined on a nonempty set which have values in $\mathbb{R}$. Is this a
vector space? Explain. The operations are defined as follows. Here f, g signify functions and a is a scalar.

\[
\begin{aligned}
(f + g)(x) &= f(x) + g(x) \\
(af)(x) &= a(f(x))
\end{aligned}
\]


Exercise 9.1.6 Denote by $\mathbb{R}^{\mathbb{N}}$ the set of real valued sequences. For $\vec{a} \equiv \{a_n\}_{n=1}^{\infty}$, $\vec{b} \equiv \{b_n\}_{n=1}^{\infty}$ two of these,
define their sum to be given by

\[
\vec{a} + \vec{b} = \{a_n + b_n\}_{n=1}^{\infty}
\]

and define scalar multiplication by

\[
c\vec{a} = \{c a_n\}_{n=1}^{\infty} \text{ where } \vec{a} = \{a_n\}_{n=1}^{\infty}
\]

Is this a special case of Exercise 9.1.5? Is this a vector space?

Exercise 9.1.7 Let $\mathbb{C}^2$ be the set of ordered pairs of complex numbers. Define addition and scalar multiplication
in the usual way.

\[
(z, w) + (\hat{z}, \hat{w}) = (z + \hat{z}, w + \hat{w}), \quad u(z, w) = (uz, uw)
\]

Here the scalars are from $\mathbb{C}$. Show this is a vector space.

Exercise  9.1.8  Let  V  be  the  set  of  functions  defined  on  a  nonempty  set  which  have  values  in  a  vector  space 
W.  Is  this  a  vector  space?  Explain. 

Exercise 9.1.9 Consider the space of m × n matrices with operation of addition and scalar multiplication
defined the usual way. That is, if A, B are two m × n matrices and c a scalar,

\[
(A + B)_{ij} = A_{ij} + B_{ij}, \quad (cA)_{ij} = c(A_{ij})
\]

Exercise 9.1.10 Consider the set of n × n symmetric matrices. That is, $A = A^T$. In other words, $A_{ij} = A_{ji}$.
Show that this set of symmetric matrices is a vector space and a subspace of the vector space of n × n
matrices.

Exercise 9.1.11 Consider the set of all vectors in $\mathbb{R}^2$, (x, y) such that $x + y \geq 0$. Let the vector space
operations be the usual ones. Is this a vector space? Is it a subspace of $\mathbb{R}^2$?

Exercise 9.1.12 Consider the vectors in $\mathbb{R}^2$, (x, y) such that xy = 0. Is this a subspace of $\mathbb{R}^2$? Is it a vector
space? The addition and scalar multiplication are the usual operations.

Exercise 9.1.13 Define the operation of vector addition on $\mathbb{R}^2$ by (x, y) + (u, v) = (x + u, y + v + 1). Let
scalar multiplication be the usual operation. Is this a vector space with these operations? Explain.

Exercise  9.1.14  Let  the  vectors  be  real  numbers.  Define  vector  space  operations  in  the  usual  way.  That 
is  x  +  y  means  to  add  the  two  numbers  and  xy  means  to  multiply  them.  Is  R  with  these  operations  a  vector 
space?  Explain. 

Exercise 9.1.15 Let the scalars be the rational numbers and let the vectors be real numbers which are of the
form $a + b\sqrt{2}$ for a, b rational numbers. Show that with the usual operations, this is a vector space.

Exercise 9.1.16 Let $\mathbb{P}_2$ be the set of all polynomials of degree 2 or less. That is, these are of the form
$a + bx + cx^2$. Addition is defined as

\[
(a + bx + cx^2) + (\hat{a} + \hat{b}x + \hat{c}x^2) = (a + \hat{a}) + (b + \hat{b})x + (c + \hat{c})x^2
\]

and scalar multiplication is defined as

\[
d(a + bx + cx^2) = da + dbx + cdx^2
\]

Show that, with this definition of the vector space operations, $\mathbb{P}_2$ is a vector space. Now let V denote
those polynomials $a + bx + cx^2$ such that a + b + c = 0. Is V a subspace of $\mathbb{P}_2$? Explain.

Exercise 9.1.17 Let M, N be subspaces of a vector space V and consider M + N defined as the set of all
m + n where $m \in M$ and $n \in N$. Show that M + N is a subspace of V.

Exercise 9.1.18 Let M, N be subspaces of a vector space V. Then $M \cap N$ consists of all vectors which are
in both M and N. Show that $M \cap N$ is a subspace of V.

Exercise 9.1.19 Let M, N be subspaces of the vector space $\mathbb{R}^2$. Then $N \cup M$ consists of all vectors which are
in either M or N. Show that $N \cup M$ is not necessarily a subspace of $\mathbb{R}^2$ by giving an example where $N \cup M$
fails to be a subspace.

Exercise 9.1.20 Let X consist of the real valued functions which are defined on an interval [a, b]. For
$f, g \in X$, f + g is the name of the function which satisfies (f + g)(x) = f(x) + g(x). For s a real number,
(sf)(x) = s(f(x)). Show this is a vector space.

Exercise 9.1.21 Consider functions defined on $\{1, 2, \cdots, n\}$ having values in $\mathbb{R}$. Explain how, if V is the
set of all such functions, V can be considered as $\mathbb{R}^n$.

Exercise 9.1.22 Let the vectors be polynomials of degree no more than 3. Show that with the usual
definitions of scalar multiplication and addition wherein, for p(x) a polynomial, (ap)(x) = ap(x) and for
p, q polynomials (p + q)(x) = p(x) + q(x), this is a vector space.


9.2  Spanning  Sets 


In this section we will examine the concept of spanning introduced earlier in terms of $\mathbb{R}^n$. Here, we
will discuss these concepts in terms of abstract vector spaces.

Consider the following definition.

In particular, we often speak of subsets of a vector space, such as $X \subseteq V$. By this we mean that every
element in the set X is contained in the vector space V.


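This definition leads to our next concept of span.


Definition 9.13: Span of Vectors

Let $\{\vec{v}_1, \cdots, \vec{v}_n\} \subseteq V$. Then

\[
\operatorname{span}\{\vec{v}_1, \cdots, \vec{v}_n\} = \left\{ \sum_{i=1}^{n} c_i \vec{v}_i : c_i \in \mathbb{R} \right\}
\]

When we say that a vector $\vec{w}$ is in $\operatorname{span}\{\vec{v}_1, \cdots, \vec{v}_n\}$ we mean that $\vec{w}$ can be written as a linear combination
of the $\vec{v}_i$. We say that a collection of vectors $\{\vec{v}_1, \cdots, \vec{v}_n\}$ is a spanning set for V if
$V = \operatorname{span}\{\vec{v}_1, \cdots, \vec{v}_n\}$.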

Consider  the  following  example. 


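Solution.

First consider A. We want to see if scalars s, t can be found such that $A = sM_1 + tM_2$.

\[
\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}
= s \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
+ t \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
\]

The solution to this equation is given by

\[
\begin{aligned}
1 &= s \\
2 &= t
\end{aligned}
\]

and it follows that A is in $\operatorname{span}\{M_1, M_2\}$.

Now consider B. Again we write $B = sM_1 + tM_2$ and see if a solution can be found for s, t.

\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
= s \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
+ t \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
\]

Clearly no values of s and t can be found such that this equation holds, since the right side always has zeros in its off-diagonal entries. Therefore B is not in $\operatorname{span}\{M_1, M_2\}$.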


Consider  another  example. 


Solution. To show that p(x) is in the given span, we need to show that it can be written as a linear
combination of polynomials in the span. Suppose scalars a, b existed such that

\[
7x^2 + 4x - 3 = a(4x^2 + x) + b(x^2 - 2x + 3)
\]

If this linear combination were to hold, the following would be true:

\[
\begin{aligned}
4a + b &= 7 \\
a - 2b &= 4 \\
3b &= -3
\end{aligned}
\]

You can verify that a = 2, b = -1 satisfies this system of equations. This means that we can write p(x)
as follows:

\[
7x^2 + 4x - 3 = 2(4x^2 + x) - (x^2 - 2x + 3)
\]

Hence p(x) is in the given span. ♠
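
The arithmetic in this solution is easy to confirm mechanically. As a minimal sketch, assume each polynomial is stored as its coefficient tuple (coefficient of $x^2$, coefficient of x, constant); the helper name `combo` is illustrative.

```python
# Polynomials are coefficient tuples (x^2, x, constant).
def combo(a, u, b, v):
    """Return the coefficients of a*u + b*v."""
    return tuple(a * ui + b * vi for ui, vi in zip(u, v))

u = (4, 1, 0)     # 4x^2 + x
v = (1, -2, 3)    # x^2 - 2x + 3
p = (7, 4, -3)    # 7x^2 + 4x - 3

# Check that 2u - v reproduces p, as claimed in the solution.
print(combo(2, u, -1, v) == p)  # True
```

Comparing coefficient tuples is the same as equating coefficients of like powers of x, which is how the three equations in a and b were obtained.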

Consider  the  following  example. 


Solution. Let $p(x) = ax^2 + bx + c$ be an arbitrary polynomial in $\mathbb{P}_2$. To show that S is a spanning set, it
suffices to show that p(x) can be written as a linear combination of the elements of S. In other words, can
we find r, s, t such that:

\[
p(x) = ax^2 + bx + c = r(x^2 + 1) + s(x - 2) + t(2x^2 - x)
\]

If a solution r, s, t can be found, then this shows that any such polynomial p(x) can be written as
a linear combination of the above polynomials and S is a spanning set.

\[
\begin{aligned}
ax^2 + bx + c &= r(x^2 + 1) + s(x - 2) + t(2x^2 - x) \\
&= rx^2 + r + sx - 2s + 2tx^2 - tx \\
&= (r + 2t)x^2 + (s - t)x + (r - 2s)
\end{aligned}
\]

For this to be true, the following must hold:

\[
\begin{aligned}
a &= r + 2t \\
b &= s - t \\
c &= r - 2s
\end{aligned}
\]

To check that a solution exists, set up the augmented matrix and row reduce:

\[
\left[ \begin{array}{rrr|c}
1 & 0 & 2 & a \\
0 & 1 & -1 & b \\
1 & -2 & 0 & c
\end{array} \right]
\rightarrow \cdots \rightarrow
\left[ \begin{array}{rrr|c}
1 & 0 & 0 & \frac{1}{2}a + b + \frac{1}{2}c \\
0 & 1 & 0 & \frac{1}{4}a + \frac{1}{2}b - \frac{1}{4}c \\
0 & 0 & 1 & \frac{1}{4}a - \frac{1}{2}b - \frac{1}{4}c
\end{array} \right]
\]

Clearly a solution exists for any choice of a, b, c. Hence S is a spanning set for $\mathbb{P}_2$. ♠
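
One way to gain confidence in a row reduction like this is to substitute the solution back into the system. The sketch below does so with exact fractions. The closed forms for r, s, t are obtained by solving a = r + 2t, b = s - t, c = r - 2s by hand, and the function names `coefficients` and `expand` are illustrative.

```python
from fractions import Fraction as F

def coefficients(a, b, c):
    """Solve a = r + 2t, b = s - t, c = r - 2s for (r, s, t)."""
    a, b, c = F(a), F(b), F(c)
    r = a / 2 + b + c / 2
    s = a / 4 + b / 2 - c / 4
    t = a / 4 - b / 2 - c / 4
    return r, s, t

def expand(r, s, t):
    """Coefficients (x^2, x, constant) of r(x^2+1) + s(x-2) + t(2x^2-x)."""
    return (r + 2 * t, s - t, r - 2 * s)

target = (3, -5, 7)   # the polynomial 3x^2 - 5x + 7, an arbitrary test case
print(expand(*coefficients(*target)) == target)
```

Because the formulas are defined for every a, b, c, the same substitution succeeds for any target polynomial, which is what it means for S to span the space.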


Exercises 


Exercise 9.2.1 Let V be a vector space and suppose $\{\vec{x}_1, \cdots, \vec{x}_k\}$ is a set of vectors in V. Show that $\vec{0}$ is in
$\operatorname{span}\{\vec{x}_1, \cdots, \vec{x}_k\}$.

Exercise 9.2.2 Determine if $p(x) = 4x^2 - x$ is in the span given by

\[
\operatorname{span}\{x^2 + x, x^2 - 1, -x + 2\}
\]

Exercise 9.2.3 Determine if $p(x) = -x^2 + x + 2$ is in the span given by

\[
\operatorname{span}\{x^2 + x + 1, 2x^2 + x\}
\]

Exercise 9.2.4 Determine if $A = \begin{bmatrix} 1 & 3 \\ 0 & 0 \end{bmatrix}$ is in the span given by

\[
\operatorname{span}\left\{
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},
\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix},
\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}
\right\}
\]

Exercise 9.2.5 Show that the spanning set in Exercise 9.2.4 is a spanning set for $M_{22}$, the vector space of
all 2 × 2 matrices.


9.3  Linear  Independence


In this section, we will again explore concepts introduced earlier in terms of $\mathbb{R}^n$ and extend them to
apply to abstract vector spaces.


The  set  of  vectors  is  called  linearly  dependent  if  it  is  not  linearly  independent. 


Solution. To determine if this set S is linearly independent, we write

\[
a(x^2 + 2x - 1) + b(2x^2 - x + 3) = 0x^2 + 0x + 0
\]

If it is linearly independent, then a = b = 0 will be the only solution. We proceed as follows.

\[
\begin{aligned}
a(x^2 + 2x - 1) + b(2x^2 - x + 3) &= 0x^2 + 0x + 0 \\
ax^2 + 2ax - a + 2bx^2 - bx + 3b &= 0x^2 + 0x + 0 \\
(a + 2b)x^2 + (2a - b)x - a + 3b &= 0x^2 + 0x + 0
\end{aligned}
\]

It follows that

\[
\begin{aligned}
a + 2b &= 0 \\
2a - b &= 0 \\
-a + 3b &= 0
\end{aligned}
\]

The augmented matrix and resulting reduced row-echelon form are given by

\[
\left[ \begin{array}{rr|r}
1 & 2 & 0 \\
2 & -1 & 0 \\
-1 & 3 & 0
\end{array} \right]
\rightarrow \cdots \rightarrow
\left[ \begin{array}{rr|r}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 0
\end{array} \right]
\]

Hence the solution is a = b = 0 and the set is linearly independent. ♠
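
The same conclusion can be reached by computing the rank of the coefficient matrix: the homogeneous system has only the trivial solution exactly when the rank equals the number of unknowns. The following is a small sketch of Gaussian elimination over the rationals; the function name `rank` and the variable `coeff` are illustrative.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank_count = 0
    for col in range(len(m[0])):
        # Find a row at or below the current pivot position with a nonzero entry.
        pivot = next((i for i in range(rank_count, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rank_count], m[pivot] = m[pivot], m[rank_count]
        # Clear the column in every other row.
        for i in range(len(m)):
            if i != rank_count and m[i][col] != 0:
                factor = m[i][col] / m[rank_count][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[rank_count])]
        rank_count += 1
    return rank_count

# Columns hold the coefficients of x^2 + 2x - 1 and 2x^2 - x + 3.
coeff = [[1, 2], [2, -1], [-1, 3]]
print(rank(coeff) == 2)  # rank 2 with two unknowns: only a = b = 0
```

Exact fractions avoid the rounding issues that can make a floating-point rank computation unreliable for nearly dependent vectors.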

The  next  example  shows  us  what  it  means  for  a  set  to  be  dependent. 


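Solution. To determine if S is linearly independent, we look for solutions to

\[
a \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}
+ b \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
+ c \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

Notice that this equation has nontrivial solutions, for example a = 2, b = 3 and c = -1. Therefore S is
dependent. ♠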
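
The nontrivial solution quoted in this example can be verified directly. This is a short sketch that assumes the three vectors are (-1, 0, 1), (1, 1, 1), and (1, 3, 5), as read from the equation in the solution; `lin_comb` is an illustrative helper name.

```python
def lin_comb(coeffs, vectors):
    """Return the linear combination sum(c * v) as a tuple."""
    return tuple(sum(c * v[i] for c, v in zip(coeffs, vectors))
                 for i in range(len(vectors[0])))

S = [(-1, 0, 1), (1, 1, 1), (1, 3, 5)]

# a = 2, b = 3, c = -1 should give the zero vector.
print(lin_comb([2, 3, -1], S))  # (0, 0, 0)
```

Since the coefficients are not all zero, this single computation is enough to establish dependence.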

The  following  is  an  important  result  regarding  dependent  sets. 


Lemma  9.20:  Dependent  Sets 


Let V be a vector space and suppose $W = \{\vec{v}_1, \vec{v}_2, \cdots, \vec{v}_k\}$ is a subset of V. Then W is dependent if
and only if $\vec{v}_i$ can be written as a linear combination of $\{\vec{v}_1, \cdots, \vec{v}_{i-1}, \vec{v}_{i+1}, \cdots, \vec{v}_k\}$ for some $i \leq k$.


Revisit Example 9.19 with this in mind. Notice that we can write one of the three vectors as a combination
of the others.

\[
\begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix}
= 2 \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}
+ 3 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
\]

By Lemma 9.20 this set is dependent.

If  we  know  that  one  particular  set  is  linearly  independent,  we  can  use  this  information  to  determine  if 
a  related  set  is  linearly  independent.  Consider  the  following  example. 


Example  9.21:  Related  Independent  Sets 


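Let V be a vector space and suppose $S \subseteq V$ is a set of linearly independent vectors given by
$S = \{\vec{u}, \vec{v}, \vec{w}\}$. Let $R \subseteq V$ be given by $R = \left\{ 2\vec{u} - \vec{w}, \vec{w} + \vec{v}, 3\vec{v} + \frac{1}{2}\vec{u} \right\}$. Show that R is also linearly
independent.

Solution. To determine if R is linearly independent, we write

\[
a(2\vec{u} - \vec{w}) + b(\vec{w} + \vec{v}) + c\left(3\vec{v} + \tfrac{1}{2}\vec{u}\right) = \vec{0}
\]

If the set is linearly independent, the only solution will be a = b = c = 0. We proceed as follows.

\[
\begin{aligned}
a(2\vec{u} - \vec{w}) + b(\vec{w} + \vec{v}) + c\left(3\vec{v} + \tfrac{1}{2}\vec{u}\right) &= \vec{0} \\
2a\vec{u} - a\vec{w} + b\vec{w} + b\vec{v} + 3c\vec{v} + \tfrac{1}{2}c\vec{u} &= \vec{0} \\
\left(2a + \tfrac{1}{2}c\right)\vec{u} + (b + 3c)\vec{v} + (-a + b)\vec{w} &= \vec{0}
\end{aligned}
\]

We know that the set $S = \{\vec{u}, \vec{v}, \vec{w}\}$ is linearly independent, which implies that the coefficients in the
last line of this equation must all equal 0. In other words:

\[
\begin{aligned}
2a + \tfrac{1}{2}c &= 0 \\
b + 3c &= 0 \\
-a + b &= 0
\end{aligned}
\]

The augmented matrix and resulting reduced row-echelon form are given by:

\[
\left[ \begin{array}{rrr|r}
2 & 0 & \frac{1}{2} & 0 \\
0 & 1 & 3 & 0 \\
-1 & 1 & 0 & 0
\end{array} \right]
\rightarrow \cdots \rightarrow
\left[ \begin{array}{rrr|r}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0
\end{array} \right]
\]

Hence the solution is a = b = c = 0 and the set is linearly independent. ♠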
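
For a square homogeneous system, an alternative to row reduction is to check that the determinant of the coefficient matrix is nonzero. The sketch below assumes the third vector of R is $3\vec{v} + \frac{1}{2}\vec{u}$, so that the system is 2a + c/2 = 0, b + 3c = 0, -a + b = 0; that reading of the fraction is an assumption about the original example, and `det3` is an illustrative helper name.

```python
from fractions import Fraction as F

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Coefficient matrix of 2a + c/2 = 0, b + 3c = 0, -a + b = 0
# (the entry 1/2 is an assumed reading of the example).
M = [[2, 0, F(1, 2)], [0, 1, 3], [-1, 1, 0]]

print(det3(M))  # -11/2, nonzero, so only a = b = c = 0 solves the system
```

A nonzero determinant means the matrix is invertible, so the reduced row-echelon form of the coefficient matrix is the identity, in agreement with the row reduction above.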

The following theorem was discussed in terms of $\mathbb{R}^n$. We consider it here in the general case.


Theorem 9.22: Unique Representation


Let V be a vector space and let $U = \{\vec{v}_1, \cdots, \vec{v}_k\} \subseteq V$ be an independent set. If $\vec{v} \in \operatorname{span} U$, then $\vec{v}$
can be written uniquely as a linear combination of the vectors in U.


Consider the span of a linearly independent set of vectors. Suppose we take a vector which is not in this
span and add it to the set. The following lemma claims that the resulting set is still linearly independent.

Proof. Suppose $\sum_{i=1}^{k} c_i \vec{u}_i + d\vec{v} = \vec{0}$. It is required to verify that each $c_i = 0$ and that d = 0. But if $d \neq 0$,
then you can solve for $\vec{v}$ as a linear combination of the vectors $\{\vec{u}_1, \cdots, \vec{u}_k\}$,

\[
\vec{v} = -\sum_{i=1}^{k} \left( \frac{c_i}{d} \right) \vec{u}_i
\]

contrary to the assumption that $\vec{v}$ is not in the span of the $\vec{u}_i$. Therefore, d = 0. But then $\sum_{i=1}^{k} c_i \vec{u}_i = \vec{0}$ and
the linear independence of $\{\vec{u}_1, \cdots, \vec{u}_k\}$ implies each $c_i = 0$ also. ♠

Consider  the  following  example. 


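Solution. Instead of writing a linear combination of the matrices which equals 0 and showing that the
coefficients must equal 0, we can instead use Lemma 9.23.

To do so, we show that

\[
\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}
\notin \operatorname{span}\left\{
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},
\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}
\right\}
\]

Write

\[
\begin{aligned}
\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}
&= a \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
 + b \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \\
&= \begin{bmatrix} a & 0 \\ 0 & 0 \end{bmatrix}
 + \begin{bmatrix} 0 & b \\ 0 & 0 \end{bmatrix} \\
&= \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix}
\end{aligned}
\]

Clearly there are no possible a, b to make this equation true, because the lower-left entry on the right is always 0. Hence the new matrix does not lie in the
span of the matrices in S. By Lemma 9.23, R is also linearly independent. ♠


Exercises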


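Exercise 9.3.1 Consider the vector space of polynomials of degree at most 2, $\mathbb{P}_2$. Determine whether the following is a basis for $\mathbb{P}_2$.

$$\{x^2 + x + 1,\ 2x^2 + 2x + 1,\ x + 1\}$$

Hint: There is an isomorphism from $\mathbb{R}^3$ to $\mathbb{P}_2$. It is defined as follows:

$$T e_1 = 1,\quad T e_2 = x,\quad T e_3 = x^2$$

Then extend $T$ linearly. Thus

$$T \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = x^2 + x + 1,\quad T \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} = 2x^2 + 2x + 1,\quad T \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = 1 + x$$

It follows that if

$$\left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \right\}$$

is a basis for $\mathbb{R}^3$, then the polynomials will be a basis for $\mathbb{P}_2$ because they will be independent. Recall that an isomorphism takes a linearly independent set to a linearly independent set. Also, since $T$ is an isomorphism, it preserves all linear relations.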


Exercise 9.3.2 Find a basis in $\mathbb{P}_2$ for the subspace

$$\operatorname{span}\{1 + x + x^2,\ 1 + 2x,\ 1 + 5x - 3x^2\}$$

If the above three vectors do not yield a basis, exhibit one of them as a linear combination of the others.
Hint: This is the situation in which you have a spanning set and you want to cut it down to form a linearly independent set which is also a spanning set. Use the same isomorphism above. Since $T$ is an isomorphism, it preserves all linear relations, so if such can be found in $\mathbb{R}^3$, the same linear relations will be present in $\mathbb{P}_2$.

Exercise 9.3.3 Find a basis in $\mathbb{P}_3$ for the subspace

$$\operatorname{span}\{1 + x - x^2 + x^3,\ 1 + 2x + 3x^3,\ -1 + 3x + 5x^2 + 7x^3,\ 1 + 6x + 4x^2 + 11x^3\}$$

If the above four vectors do not yield a basis, exhibit one of them as a linear combination of the others.


Exercise 9.3.4 Find a basis in $\mathbb{P}_3$ for the subspace

$$\operatorname{span}\{1 + x - x^2 + x^3,\ 1 + 2x + 3x^3,\ -1 + 3x + 5x^2 + 7x^3,\ 1 + 6x + 4x^2 + 11x^3\}$$

If the above four vectors do not yield a basis, exhibit one of them as a linear combination of the others.
Exercise 9.3.5 Find a basis in $\mathbb{P}_3$ for the subspace

$$\operatorname{span}\{x^3 - 2x^2 + x + 2,\ 3x^3 - x^2 + 2x + 2,\ 7x^3 + x^2 + 4x + 2,\ 5x^3 + 3x + 2\}$$

If the above four vectors do not yield a basis, exhibit one of them as a linear combination of the others.

Exercise 9.3.6 Find a basis in $\mathbb{P}_3$ for the subspace

$$\operatorname{span}\{x^3 + 2x^2 + x - 2,\ 3x^3 + 3x^2 + 2x - 2,\ 3x^3 + x + 2,\ 3x^3 + x + 2\}$$

If the above four vectors do not yield a basis, exhibit one of them as a linear combination of the others.

Exercise 9.3.7 Find a basis in $\mathbb{P}_3$ for the subspace

$$\operatorname{span}\{x^3 - 5x^2 + x + 5,\ 3x^3 - 4x^2 + 2x + 5,\ 5x^3 + 8x^2 + 2x - 5,\ 11x^3 + 6x + 5\}$$

If the above four vectors do not yield a basis, exhibit one of them as a linear combination of the others.

Exercise 9.3.8 Find a basis in $\mathbb{P}_3$ for the subspace

$$\operatorname{span}\{x^3 - 3x^2 + x + 3,\ 3x^3 - 2x^2 + 2x + 3,\ 7x^3 + 7x^2 + 3x - 3,\ 7x^3 + 4x + 3\}$$

If the above four vectors do not yield a basis, exhibit one of them as a linear combination of the others.

Exercise 9.3.9 Find a basis in $\mathbb{P}_3$ for the subspace

$$\operatorname{span}\{x^3 - x^2 + x + 1,\ 3x^3 + 2x + 1,\ 4x^3 + x^2 + 2x + 1,\ 3x^3 + 2x - 1\}$$

If the above four vectors do not yield a basis, exhibit one of them as a linear combination of the others.

Exercise 9.3.10 Find a basis in $\mathbb{P}_3$ for the subspace

$$\operatorname{span}\{x^3 - x^2 + x + 1,\ 3x^3 + 2x + 1,\ 13x^3 + x^2 + 8x + 4,\ 3x^3 + 2x - 1\}$$

If the above four vectors do not yield a basis, exhibit one of them as a linear combination of the others.

Exercise 9.3.11 Find a basis in $\mathbb{P}_3$ for the subspace

$$\operatorname{span}\{x^3 - 3x^2 + x + 3,\ 3x^3 - 2x^2 + 2x + 3,\ -5x^3 + 5x^2 - 4x - 6,\ 7x^3 + 4x - 3\}$$

If the above four vectors do not yield a basis, exhibit one of them as a linear combination of the others.

Exercise 9.3.12 Find a basis in $\mathbb{P}_3$ for the subspace

$$\operatorname{span}\{x^3 - 2x^2 + x + 2,\ 3x^3 - x^2 + 2x + 2,\ 7x^3 - x^2 + 4x + 4,\ 5x^3 + 3x - 2\}$$

If the above four vectors do not yield a basis, exhibit one of them as a linear combination of the others.

Exercise 9.3.13 Find a basis in $\mathbb{P}_3$ for the subspace

$$\operatorname{span}\{x^3 - 2x^2 + x + 2,\ 3x^3 - x^2 + 2x + 2,\ 3x^3 + 4x^2 + x - 2,\ 7x^3 - x^2 + 4x + 4\}$$

If the above four vectors do not yield a basis, exhibit one of them as a linear combination of the others.

Exercise 9.3.14 Find a basis in $\mathbb{P}_3$ for the subspace

$$\operatorname{span}\{x^3 - 4x^2 + x + 4,\ 3x^3 - 3x^2 + 2x + 4,\ -3x^3 + 3x^2 - 2x - 4,\ -2x^3 + 4x^2 - 2x - 4\}$$

If the above four vectors do not yield a basis, exhibit one of them as a linear combination of the others.

Exercise 9.3.15 Find a basis in $\mathbb{P}_3$ for the subspace

$$\operatorname{span}\{x^3 + 2x^2 + x - 2,\ 3x^3 + 3x^2 + 2x - 2,\ 5x^3 + x^2 + 2x + 2,\ 10x^3 + 10x^2 + 6x - 6\}$$

If the above four vectors do not yield a basis, exhibit one of them as a linear combination of the others.

Exercise 9.3.16 Find a basis in $\mathbb{P}_3$ for the subspace

$$\operatorname{span}\{x^3 + x^2 + x - 1,\ 3x^3 + 2x^2 + 2x - 1,\ x^3 + 1,\ 4x^3 + 3x^2 + 2x - 1\}$$

If the above four vectors do not yield a basis, exhibit one of them as a linear combination of the others.

Exercise 9.3.17 Find a basis in $\mathbb{P}_3$ for the subspace

$$\operatorname{span}\{x^3 - x^2 + x + 1,\ 3x^3 + 2x + 1,\ x^3 + 2x^2 - 1,\ 4x^3 + x^2 + 2x + 1\}$$

If the above four vectors do not yield a basis, exhibit one of them as a linear combination of the others.

Exercise 9.3.18 Here are some vectors.

$$\{x^3 + x^2 - x - 1,\ 3x^3 + 2x^2 + 2x - 1\}$$

If these are linearly independent, extend to a basis for all of $\mathbb{P}_3$.

Exercise 9.3.19 Here are some vectors.

$$\{x^3 - 2x^2 - x + 2,\ 3x^3 - x^2 + 2x + 2\}$$

If these are linearly independent, extend to a basis for all of $\mathbb{P}_3$.

Exercise 9.3.20 Here are some vectors.

$$\{x^3 - 3x^2 - x + 3,\ 3x^3 - 2x^2 + 2x + 3\}$$

If these are linearly independent, extend to a basis for all of $\mathbb{P}_3$.

Exercise 9.3.21 Here are some vectors.

$$\{x^3 - 2x^2 - 3x + 2,\ 3x^3 - x^2 - 6x + 2,\ -8x^3 + 18x + 10\}$$

If these are linearly independent, extend to a basis for all of $\mathbb{P}_3$.

Exercise 9.3.22 Here are some vectors.

$$\{x^3 - 3x^2 - 3x + 3,\ 3x^3 - 2x^2 - 6x + 3,\ -8x^3 + 18x + 40\}$$

If these are linearly independent, extend to a basis for all of $\mathbb{P}_3$.

Exercise  9.3.23  Here  are  some  vectors. 

$$\{x^3 - x^2 + x + 1,\ 3x^3 + 2x + 1,\ 4x^3 + 2x + 2\}$$

If  these  are  linearly  independent,  extend  to  a  basis  for  all  of  P3. 

Exercise 9.3.24 Here are some vectors.

$$\{x^3 + x^2 + 2x - 1,\ 3x^3 + 2x^2 + 4x - 1,\ 7x^3 + 8x + 23\}$$

If these are linearly independent, extend to a basis for all of $\mathbb{P}_3$.

Exercise 9.3.25 Determine if the following set is linearly independent. If it is linearly dependent, write one vector as a linear combination of the other vectors in the set.

$$\{x + 1,\ x^2 + 2,\ x^2 - x - 3\}$$

Exercise 9.3.26 Determine if the following set is linearly independent. If it is linearly dependent, write one vector as a linear combination of the other vectors in the set.

$$\{x^2 + x,\ -2x^2 - 4x - 6,\ 2x - 2\}$$


Exercise  9.3.27  Determine  if  the  following  set  is  linearly  independent.  If  it  is  linearly  dependent,  write 
one  vector  as  a  linear  combination  of  the  other  vectors  in  the  set. 


1 

0 


0 

2 


Exercise  9.3.28  Determine  if  the  following  set  is  linearly  independent.  If  it  is  linearly  dependent,  write 
one  vector  as  a  linear  combination  of  the  other  vectors  in  the  set. 


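$$\left\{ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 1 \end{bmatrix} \right\}$$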


Exercise  9.3.29  If  you  have  5  vectors  in  R5  and  the  vectors  are  linearly  independent,  can  it  always  be 
concluded  they  span  R5? 

Exercise  9.3.30  If  you  have  6  vectors  in  R5,  is  it  possible  they  are  linearly  independent?  Explain. 

Exercise  9.3.31  Let  P3  be  the  polynomials  of  degree  no  more  than  3.  Determine  which  of  the  following 
are  bases  for  this  vector  space. 
(a) $\{x + 1,\ x^3 + x^2 + 2x,\ x^2 + x,\ x^3 + x^2 + x\}$

(b) $\{x^3 + 1,\ x^2 + x,\ 2x^3 + x^2,\ 2x^3 - x^2 - 3x + 1\}$


Exercise 9.3.32 In the context of the above problem, consider polynomials

$$\{a_i x^3 + b_i x^2 + c_i x + d_i,\ i = 1, 2, 3, 4\}$$

Show that this collection of polynomials is linearly independent on an interval $[s, t]$ if and only if

$$\begin{bmatrix} a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \\ a_4 & b_4 & c_4 & d_4 \end{bmatrix}$$

is an invertible matrix.


Exercise 9.3.33 Let the field of scalars be $\mathbb{Q}$, the rational numbers, and let the vectors be of the form $a + b\sqrt{2}$ where $a, b$ are rational numbers. Show that this collection of vectors is a vector space with field of scalars $\mathbb{Q}$ and give a basis for this vector space.


Exercise 9.3.34 Suppose $V$ is a finite dimensional vector space. Based on the exchange theorem above, it was shown that any two bases have the same number of vectors in them. Give a different proof of this fact using the earlier material in the book. Hint: Suppose $\{x_1, \dots, x_n\}$ and $\{y_1, \dots, y_m\}$ are two bases with $m < n$. Then define

$$\phi : \mathbb{R}^n \to V, \quad \psi : \mathbb{R}^m \to V$$

by

$$\phi(a) = \sum_{k=1}^{n} a_k x_k, \quad \psi(b) = \sum_{j=1}^{m} b_j y_j$$

Consider the linear transformation $\psi^{-1} \circ \phi$. Argue it is a one to one and onto mapping from $\mathbb{R}^n$ to $\mathbb{R}^m$. Now consider a matrix of this linear transformation and its reduced row-echelon form.
Outcomes


A.  Utilize  the  subspace  test  to  determine  if  a  set  is  a  subspace  of  a  given  vector  space. 

B.  Extend  a  linearly  independent  set  and  shrink  a  spanning  set  to  a  basis  of  a  given  vector  space. 

In this section we will examine the concept of subspaces introduced earlier in terms of $\mathbb{R}^n$. Here, we will discuss these concepts in terms of abstract vector spaces.

Consider the definition of a subspace.


The  span  of  a  set  of  vectors  as  described  in  Definition  9.13  is  an  example  of  a  subspace.  The  following 
fundamental  result  says  that  subspaces  are  subsets  of  a  vector  space  which  are  themselves  vector  spaces. 


Theorem  9.26:  Subspaces  are  Vector  Spaces 


Let  W  be  a  nonempty  collection  of  vectors  in  a  vector  space  V.  Then  W  is  a  subspace  if  and  only  if 
W  satisfies  the  vector  space  axioms,  using  the  same  operations  as  those  defined  on  V. 


Proof. Suppose first that $W$ is a subspace. It is obvious that all the algebraic laws hold on $W$ because it is a subset of $V$ and they hold on $V$. Thus $u + v = v + u$ along with the other axioms. Does $W$ contain $0$? Yes, because it contains $0u = 0$. See Theorem 9.9.

Are the operations of $V$ defined on $W$? That is, when you add vectors of $W$ do you get a vector in $W$? When you multiply a vector in $W$ by a scalar, do you get a vector in $W$? Yes. This is contained in the definition. Does every vector in $W$ have an additive inverse? Yes, by Theorem 9.9, because $-v = (-1)v$, which is given to be in $W$ provided $v \in W$.

Next suppose $W$ is a vector space. Then by definition, it is closed with respect to linear combinations. Hence it is a subspace. ♠

Consider  the  following  useful  Corollary. 


Corollary  9.27 :  Span  is  a  Subspace 


Let $V$ be a vector space with $W \subseteq V$. If $W = \operatorname{span}\{v_1, \dots, v_n\}$ then $W$ is a subspace of $V$.


When  determining  spanning  sets  the  following  theorem  proves  useful. 


Theorem  9.28:  Spanning  Set 


Let $W \subseteq V$ for a vector space $V$ and suppose $W = \operatorname{span}\{v_1, v_2, \dots, v_n\}$.

Let $U \subseteq V$ be a subspace such that $v_1, v_2, \dots, v_n \in U$. Then it follows that $W \subseteq U$.


In  other  words,  this  theorem  claims  that  any  subspace  that  contains  a  set  of  vectors  must  also  contain 
the  span  of  these  vectors. 

The  following  example  will  show  that  two  spans,  described  differently,  can  in  fact  be  equal. 


Example  9.29:  Equal  Span 


Let $p(x), q(x)$ be polynomials and suppose $U = \operatorname{span}\{2p(x) - q(x),\ p(x) + 3q(x)\}$ and $W = \operatorname{span}\{p(x), q(x)\}$. Show that $U = W$.

Solution. We will use Theorem 9.28 to show that $U \subseteq W$ and $W \subseteq U$. It will then follow that $U = W$.

1. $U \subseteq W$

Notice that $2p(x) - q(x)$ and $p(x) + 3q(x)$ are both in $W = \operatorname{span}\{p(x), q(x)\}$. Then by Theorem 9.28 $W$ must contain the span of these polynomials and so $U \subseteq W$.

2. $W \subseteq U$

Notice that

$$p(x) = \frac{3}{7}\left(2p(x) - q(x)\right) + \frac{1}{7}\left(p(x) + 3q(x)\right)$$
$$q(x) = -\frac{1}{7}\left(2p(x) - q(x)\right) + \frac{2}{7}\left(p(x) + 3q(x)\right)$$

Hence $p(x), q(x)$ are in $\operatorname{span}\{2p(x) - q(x),\ p(x) + 3q(x)\}$. By Theorem 9.28 $U$ must contain the span of these polynomials and so $W \subseteq U$.

♠
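The coefficients that express $p(x)$ and $q(x)$ in terms of $2p(x) - q(x)$ and $p(x) + 3q(x)$ come from a 2 x 2 linear system. The sketch below is an illustration under stated assumptions: each polynomial is recorded only by its coefficients on $p$ and $q$ (so $2p - q$ becomes the pair (2, -1)), and the function name `solve2` is made up for this example. It uses Cramer's rule with exact fractions.

```python
from fractions import Fraction

def solve2(u1, u2, target):
    """Solve a*u1 + b*u2 = target for (a, b), where all three are pairs."""
    det = u1[0] * u2[1] - u2[0] * u1[1]
    if det == 0:
        raise ValueError("u1 and u2 are proportional; no unique solution")
    a = Fraction(target[0] * u2[1] - u2[0] * target[1], det)
    b = Fraction(u1[0] * target[1] - target[0] * u1[1], det)
    return a, b

u1 = (2, -1)   # 2p - q, written as (coefficient of p, coefficient of q)
u2 = (1, 3)    # p + 3q

coeffs_for_p = solve2(u1, u2, (1, 0))   # p = a*(2p - q) + b*(p + 3q)
coeffs_for_q = solve2(u1, u2, (0, 1))   # q = a*(2p - q) + b*(p + 3q)
print(coeffs_for_p, coeffs_for_q)
```

The nonzero determinant (here 7) is what guarantees that both $p(x)$ and $q(x)$ can be recovered, which is the step needed for $W \subseteq U$.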

To  prove  that  a  set  is  a  vector  space,  one  must  verify  each  of  the  axioms  given  in  Definition  9.2  and 
9.3.  This  is  a  cumbersome  task,  and  therefore  a  shorter  procedure  is  used  to  verify  a  subspace. 


Procedure  9.30:  Subspace  Test 


Suppose $W$ is a subset of a vector space $V$. To determine if $W$ is a subspace of $V$, it is sufficient to determine if the following three conditions hold, using the operations of $V$:

1. The additive identity $0$ of $V$ is contained in $W$.

2. For any vectors $w_1, w_2$ in $W$, $w_1 + w_2$ is also in $W$.

3. For any vector $w_1$ in $W$ and scalar $a$, the product $a w_1$ is also in $W$.


Therefore  it  suffices  to  prove  these  three  steps  to  show  that  a  set  is  a  subspace. 
Consider  the  following  example. 


Example  9.31:  Improper  Subspaces 

Let $V$ be an arbitrary vector space. Then $V$ is a subspace of itself. Similarly, the set $\{0\}$ containing only the zero vector is also a subspace.

Solution. Using the subspace test in Procedure 9.30 we can show that $V$ and $\{0\}$ are subspaces of $V$.

Since $V$ satisfies the vector space axioms it also satisfies the three steps of the subspace test. Therefore $V$ is a subspace.

Let's consider the set $\{0\}$.

1. The vector $0$ is clearly contained in $\{0\}$, so the first condition is satisfied.

2. Let $w_1, w_2$ be in $\{0\}$. Then $w_1 = 0$ and $w_2 = 0$ and so

$$w_1 + w_2 = 0 + 0 = 0$$

It follows that the sum is contained in $\{0\}$ and the second condition is satisfied.

3. Let $w_1$ be in $\{0\}$ and let $a$ be an arbitrary scalar. Then

$$a w_1 = a 0 = 0$$

Hence the product is contained in $\{0\}$ and the third condition is satisfied.

It follows that $\{0\}$ is a subspace of $V$. ♠

The two subspaces described above are called improper subspaces. Any subspace of a vector space $V$ which is not equal to $V$ or $\{0\}$ is called a proper subspace.

Consider  another  example. 


Example  9.32:  Subspace  of  Polynomials 


Let $\mathbb{P}_2$ be the vector space of polynomials of degree two or less. Let $W \subseteq \mathbb{P}_2$ be all polynomials of degree two or less which have 1 as a root. Show that $W$ is a subspace of $\mathbb{P}_2$.


Solution. First, express $W$ as follows:

$$W = \{p(x) = ax^2 + bx + c,\ a, b, c \in \mathbb{R} \mid p(1) = 0\}$$

We need to show that $W$ satisfies the three conditions of Procedure 9.30.

1. The zero polynomial of $\mathbb{P}_2$ is given by $0(x) = 0x^2 + 0x + 0 = 0$. Clearly $0(1) = 0$ so $0(x)$ is contained in $W$.

2. Let $p(x), q(x)$ be polynomials in $W$. It follows that $p(1) = 0$ and $q(1) = 0$. Now consider $p(x) + q(x)$. Let $r(x)$ represent this sum.

$$r(1) = p(1) + q(1) = 0 + 0 = 0$$

Therefore the sum is also in $W$ and the second condition is satisfied.

3. Let $p(x)$ be a polynomial in $W$ and let $a$ be a scalar. It follows that $p(1) = 0$. Consider the product $ap(x)$.

$$ap(1) = a(0) = 0$$

Therefore the product is in $W$ and the third condition is satisfied.

It follows that $W$ is a subspace of $\mathbb{P}_2$. ♠
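The three conditions of the subspace test can be spot-checked on concrete polynomials. The Python sketch below is only a sanity check on a few samples, not a proof. It assumes a polynomial $ax^2 + bx + c$ is stored as the tuple `(a, b, c)`, and the helper names (`in_W`, `add`, `scale`) are introduced here for illustration.

```python
def value_at_one(p):
    """p = (a, b, c) stands for a*x^2 + b*x + c, so p(1) = a + b + c."""
    return sum(p)

def in_W(p):
    """Membership test for W: the polynomial has 1 as a root."""
    return value_at_one(p) == 0

def add(p, q):
    """Coefficient-wise sum of two polynomials."""
    return tuple(s + t for s, t in zip(p, q))

def scale(a, p):
    """Scalar multiple of a polynomial."""
    return tuple(a * s for s in p)

zero = (0, 0, 0)
p = (1, 0, -1)   # x^2 - 1, which has 1 as a root
q = (0, 1, -1)   # x - 1, which has 1 as a root

checks = [
    in_W(zero),          # condition 1: zero polynomial is in W
    in_W(add(p, q)),     # condition 2: closed under addition (on this sample)
    in_W(scale(5, p)),   # condition 3: closed under scalar multiples (on this sample)
]
print(checks)
```

A polynomial such as $x^2 + 1$, stored as `(1, 0, 1)`, fails the membership test, which shows the test is not trivially true.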

Recall  the  definition  of  basis,  considered  now  in  the  context  of  vector  spaces. 


Definition  9.33:  Basis 


Let $V$ be a vector space. Then $\{v_1, \dots, v_n\}$ is called a basis for $V$ if the following conditions hold.

1. $\operatorname{span}\{v_1, \dots, v_n\} = V$

2. $\{v_1, \dots, v_n\}$ is linearly independent


Consider  the  following  example. 


Solution. It can be verified that $\mathbb{P}_2$ is a vector space defined under the usual addition and scalar multiplication of polynomials.

Now, since $\mathbb{P}_2 = \operatorname{span}\{x^2, x, 1\}$, the set $\{x^2, x, 1\}$ is a basis if it is linearly independent. Suppose then that

$$ax^2 + bx + c = 0x^2 + 0x + 0$$

where $a, b, c$ are real numbers. It is clear that this can only occur if $a = b = c = 0$. Hence the set is linearly independent and forms a basis of $\mathbb{P}_2$. ♠

The  next  theorem  is  an  essential  result  in  linear  algebra  and  is  called  the  exchange  theorem. 


Theorem  9.35:  Exchange  Theorem 


Let $\{x_1, \dots, x_r\}$ be a linearly independent set of vectors such that each $x_i$ is contained in $\operatorname{span}\{y_1, \dots, y_s\}$. Then $r \leq s$.


Proof. The proof will proceed as follows. First, we set up the necessary steps for the proof. Next, we will assume that $r > s$ and show that this leads to a contradiction, thus requiring that $r \leq s$.

Define $\operatorname{span}\{y_1, \dots, y_s\} = V$. Since each $x_i$ is in $\operatorname{span}\{y_1, \dots, y_s\}$, it follows there exist scalars $c_1, \dots, c_s$ such that

$$x_1 = \sum_{i=1}^{s} c_i y_i \qquad (9.1)$$

Note that not all of these scalars $c_i$ can equal zero. Suppose that all the $c_i = 0$. Then it would follow that $x_1 = 0$ and so $\{x_1, \dots, x_r\}$ would not be linearly independent. Indeed, if $x_1 = 0$, then $1 x_1 + \sum_{i=2}^{r} 0 x_i = x_1 = 0$ and so there would exist a nontrivial linear combination of the vectors $\{x_1, \dots, x_r\}$ which equals zero. Therefore at least one $c_i$ is nonzero.

Say $c_k \neq 0$. Then solve 9.1 for $y_k$ and obtain

$$y_k \in \operatorname{span}\{x_1, \underbrace{y_1, \dots, y_{k-1}, y_{k+1}, \dots, y_s}_{s-1 \text{ vectors here}}\}$$

Define $\{z_1, \dots, z_{s-1}\}$ to be

$$\{z_1, \dots, z_{s-1}\} = \{y_1, \dots, y_{k-1}, y_{k+1}, \dots, y_s\}$$

Now we can write

$$y_k \in \operatorname{span}\{x_1, z_1, \dots, z_{s-1}\}$$

Therefore, $\operatorname{span}\{x_1, z_1, \dots, z_{s-1}\} = V$. To see this, suppose $v \in V$. Then there exist constants $c_1, \dots, c_s$ such that

$$v = \sum_{i=1}^{s-1} c_i z_i + c_s y_k.$$

Replace this $y_k$ with a linear combination of the vectors $\{x_1, z_1, \dots, z_{s-1}\}$ to obtain $v \in \operatorname{span}\{x_1, z_1, \dots, z_{s-1}\}$. The vector $y_k$, in the list $\{y_1, \dots, y_s\}$, has now been replaced with the vector $x_1$ and the resulting modified list of vectors has the same span as the original list of vectors, $\{y_1, \dots, y_s\}$.

We are now ready to move on to the proof. Suppose that $r > s$ and that

$$\operatorname{span}\{x_1, \dots, x_l, z_1, \dots, z_p\} = V$$

where the process established above has continued. In other words, the vectors $z_1, \dots, z_p$ are each taken from the set $\{y_1, \dots, y_s\}$ and $l + p = s$. This was done for $l = 1$ above. Then since $r > s$, it follows that $l \leq s < r$ and so $l + 1 \leq r$. Therefore, $x_{l+1}$ is a vector not in the list $\{x_1, \dots, x_l\}$ and since

$$\operatorname{span}\{x_1, \dots, x_l, z_1, \dots, z_p\} = V$$

there exist scalars $c_i$ and $d_j$ such that

$$x_{l+1} = \sum_{i=1}^{l} c_i x_i + \sum_{j=1}^{p} d_j z_j. \qquad (9.2)$$

Not all the $d_j$ can equal zero because if this were so, it would follow that $\{x_1, \dots, x_r\}$ would be a linearly dependent set because one of the vectors would equal a linear combination of the others. Therefore, 9.2 can be solved for one of the $z_i$, say $z_k$, in terms of $x_{l+1}$ and the other $z_i$ and, just as in the above argument, replace that $z_i$ with $x_{l+1}$ to obtain

$$\operatorname{span}\{x_1, \dots, x_l, x_{l+1}, \underbrace{z_1, \dots, z_{k-1}, z_{k+1}, \dots, z_p}_{p-1 \text{ vectors here}}\} = V$$

Continue this way, eventually obtaining

$$\operatorname{span}\{x_1, \dots, x_s\} = V.$$

But then $x_r \in \operatorname{span}\{x_1, \dots, x_s\}$, contrary to the assumption that $\{x_1, \dots, x_r\}$ is linearly independent. Therefore, $r \leq s$ as claimed. ♠

The  following  corollary  follows  from  the  exchange  theorem. 


Corollary  9.36:  Two  Bases  of  the  Same  Length 


Let $B_1, B_2$ be two bases of a vector space $V$. Suppose $B_1$ contains $m$ vectors and $B_2$ contains $n$ vectors. Then $m = n$.


Proof. By Theorem 9.35, $m \leq n$ and $n \leq m$. Therefore $m = n$. ♠

This  corollary  is  very  important  so  we  provide  another  proof  independent  of  the  exchange  theorem 
above. 

Proof. Write $B_1 = \{u_1, \dots, u_m\}$ and $B_2 = \{v_1, \dots, v_n\}$, and suppose $n > m$. Then since the vectors $\{u_1, \dots, u_m\}$ span $V$, there exist scalars $c_{ij}$ such that

$$\sum_{i=1}^{m} c_{ij} u_i = v_j.$$

Therefore,

$$\sum_{j=1}^{n} d_j v_j = 0 \quad \text{if and only if} \quad \sum_{j=1}^{n} \sum_{i=1}^{m} c_{ij} d_j u_i = 0$$

if and only if

$$\sum_{i=1}^{m} \left( \sum_{j=1}^{n} c_{ij} d_j \right) u_i = 0$$

Now since $\{u_1, \dots, u_m\}$ is independent, this happens if and only if

$$\sum_{j=1}^{n} c_{ij} d_j = 0, \quad i = 1, 2, \dots, m.$$

However, this is a system of $m$ equations in $n$ variables, $d_1, \dots, d_n$, and $m < n$. Therefore, there exists a solution to this system of equations in which not all the $d_j$ are equal to zero. Recall why this is so. The augmented matrix for the system is of the form $[\, C \mid 0 \,]$ where $C$ is a matrix which has more columns than rows. Therefore, there are free variables and hence nonzero solutions to the system of equations. However, this contradicts the linear independence of $\{v_1, \dots, v_n\}$. Similarly it cannot happen that $m > n$.

♠
Given the result of the previous corollary, the following definition follows.


Definition  9.37 :  Dimension 


A  vector  space  V  is  of  dimension  n  if  it  has  a  basis  consisting  of  n  vectors. 


Notice that the dimension is well defined by Corollary 9.36. It is assumed here that $n < \infty$ and therefore such a vector space is said to be finite dimensional.


Example  9.38:  Dimension  of  a  Vector  Space 


Let  P2  be  the  set  of  all  polynomials  of  degree  at  most  2.  Find  the  dimension  of  P2. 


Solution. If we can find a basis of $\mathbb{P}_2$ then the number of vectors in the basis will give the dimension. Recall from Example 9.34 that a basis of $\mathbb{P}_2$ is given by

$$S = \{x^2, x, 1\}$$

There are three polynomials in $S$ and hence the dimension of $\mathbb{P}_2$ is three. ♠

It  is  important  to  note  that  a  basis  for  a  vector  space  is  not  unique.  A  vector  space  can  have  many 
bases.  Consider  the  following  example. 


Example  9.39:  A  Different  Basis  for  Polynomials  of  Degree  Two 


Let $\mathbb{P}_2$ be the polynomials of degree no more than 2. Is $\{x^2 + x + 1,\ 2x + 1,\ 3x^2 + 1\}$ a basis for $\mathbb{P}_2$?


Solution. Suppose these vectors are linearly independent but do not form a spanning set for $\mathbb{P}_2$. Then by Lemma 9.23, we could find a fourth polynomial in $\mathbb{P}_2$ to create a new linearly independent set containing four polynomials. However this would imply that we could find a basis of $\mathbb{P}_2$ of more than three polynomials. This contradicts the result of Example 9.38 in which we determined the dimension of $\mathbb{P}_2$ is three. Therefore if these vectors are linearly independent they must also form a spanning set and thus a basis for $\mathbb{P}_2$.

Suppose then that

$$a(x^2 + x + 1) + b(2x + 1) + c(3x^2 + 1) = 0$$
$$(a + 3c)x^2 + (a + 2b)x + (a + b + c) = 0$$

We know that $\{x^2, x, 1\}$ is linearly independent, and so it follows that

$$a + 3c = 0$$
$$a + 2b = 0$$
$$a + b + c = 0$$

and there is only one solution to this system of equations, $a = b = c = 0$. Therefore, these are linearly independent and form a basis for $\mathbb{P}_2$. ♠
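The claim that the system has only the zero solution can be checked through the determinant of its coefficient matrix. The sketch below is illustrative. It assumes the three equations are written with unknowns in the order $a, b, c$, and the function name `det3` is introduced here, not taken from the text.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Rows are the equations a + 3c = 0, a + 2b = 0, a + b + c = 0.
coefficient_matrix = [
    [1, 0, 3],
    [1, 2, 0],
    [1, 1, 1],
]

d = det3(coefficient_matrix)
only_trivial_solution = (d != 0)
print(d, only_trivial_solution)
```

A nonzero determinant means the homogeneous system has only the trivial solution, so the three polynomials are linearly independent.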
Consider the following theorem.


Theorem  9.40:  Every  Subspace  has  a  Basis 


Let  W  be  a  nonzero  subspace  of  a  finite  dimensional  vector  space  V.  Suppose  V  has  dimension  n. 
Then  W  has  a  basis  with  no  more  than  n  vectors. 


Proof. Let $w_1 \in W$ where $w_1 \neq 0$. If $\operatorname{span}\{w_1\} = W$, then it follows that $\{w_1\}$ is a basis for $W$. Otherwise, there exists $w_2 \in W$ which is not in $\operatorname{span}\{w_1\}$. By Lemma 9.23, $\{w_1, w_2\}$ is a linearly independent set of vectors. If $\operatorname{span}\{w_1, w_2\} = W$, then $\{w_1, w_2\}$ is a basis for $W$ and we are done. If $\operatorname{span}\{w_1, w_2\} \neq W$, then there exists $w_3 \in W$ with $w_3 \notin \operatorname{span}\{w_1, w_2\}$ and $\{w_1, w_2, w_3\}$ is a larger linearly independent set of vectors. Continuing this way, the process must stop before $n + 1$ steps because if not, it would be possible to obtain $n + 1$ linearly independent vectors in $V$, contrary to the exchange theorem, Theorem 9.35. ♠

If in fact a basis of $W$ has $n$ vectors, then it follows that $W = V$.


Theorem  9.41:  Subspace  of  Same  Dimension 


Let $V$ be a vector space of dimension $n$ and let $W$ be a subspace. Then $W = V$ if and only if the dimension of $W$ is also $n$.


Proof. First suppose $W = V$. Then obviously the dimension of $W$ is $n$.

Now suppose that the dimension of $W$ is $n$. Let a basis for $W$ be $\{w_1, \dots, w_n\}$. If $W$ is not equal to $V$, then let $v$ be a vector of $V$ which is not contained in $W$. Thus $v$ is not in $\operatorname{span}\{w_1, \dots, w_n\}$ and by Lemma 9.74, $\{w_1, \dots, w_n, v\}$ is linearly independent, which contradicts Theorem 9.35 because it would be an independent set of $n + 1$ vectors even though each of these vectors is in the span of $n$ vectors, a basis of $V$. ♠

Consider  the  following  example. 


Example  9.42:  Basis  of  a  Subspace 

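Let $U = \left\{ A \in M_{22} \ \middle|\ A \begin{bmatrix} 1 & 0 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 0 & -1 \end{bmatrix} A \right\}$. Then $U$ is a subspace of $M_{22}$. Find a basis of $U$, and hence $\dim(U)$.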

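Solution. Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in M_{22}$. Then

$$A \begin{bmatrix} 1 & 0 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} a + b & -b \\ c + d & -d \end{bmatrix}$$

and

$$\begin{bmatrix} 1 & 1 \\ 0 & -1 \end{bmatrix} A = \begin{bmatrix} 1 & 1 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a + c & b + d \\ -c & -d \end{bmatrix}$$

If $A \in U$, then

$$\begin{bmatrix} a + b & -b \\ c + d & -d \end{bmatrix} = \begin{bmatrix} a + c & b + d \\ -c & -d \end{bmatrix}$$

Equating entries leads to a system of four equations in the four variables $a, b, c$ and $d$.

$$a + b = a + c$$
$$-b = b + d$$
$$c + d = -c$$
$$-d = -d$$

or

$$b - c = 0$$
$$-2b - d = 0$$
$$2c + d = 0$$

The solution to this system is $a = s$, $b = -\frac{1}{2}t$, $c = -\frac{1}{2}t$, $d = t$ for any $s, t \in \mathbb{R}$, and thus

$$A = \begin{bmatrix} s & -\frac{t}{2} \\ -\frac{t}{2} & t \end{bmatrix} = s \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + t \begin{bmatrix} 0 & -\frac{1}{2} \\ -\frac{1}{2} & 1 \end{bmatrix}$$

Let

$$B = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & -\frac{1}{2} \\ -\frac{1}{2} & 1 \end{bmatrix} \right\}$$

Then $\operatorname{span}(B) = U$, and it is routine to verify that $B$ is an independent subset of $M_{22}$. Therefore $B$ is a basis of $U$, and $\dim(U) = 2$. ♠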
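The membership condition for $U$ can be checked directly on candidate matrices. The Python sketch below is an illustration under assumptions: it takes the defining relation to be $A M = N A$ with $M$ having rows (1, 0), (1, -1) and $N$ having rows (1, 1), (0, -1), and it tests the two basis matrices with entries $-\frac{1}{2}$ read from the solution. The names `matmul` and `in_U` are introduced here only for this check.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

M = [[1, 0], [1, -1]]    # assumed right-hand factor in A*M
N = [[1, 1], [0, -1]]    # assumed left-hand factor in N*A

def in_U(A):
    """True when A satisfies the defining relation A*M == N*A."""
    return matmul(A, M) == matmul(N, A)

B1 = [[F(1), F(0)], [F(0), F(0)]]
B2 = [[F(0), F(-1, 2)], [F(-1, 2), F(1)]]

# A sample combination s*B1 + t*B2 with s = 3, t = 4.
combo = [[3 * B1[i][j] + 4 * B2[i][j] for j in range(2)] for i in range(2)]

# A matrix that should fail the relation.
outside = [[0, 1], [0, 0]]

print(in_U(B1), in_U(B2), in_U(combo), in_U(outside))
```

Exact fractions are used so that the entries $-\frac{1}{2}$ compare without rounding error.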


The  following  theorem  claims  that  a  spanning  set  of  a  vector  space  V  can  be  shrunk  down  to  a  basis  of 
V.  Similarly,  a  linearly  independent  set  within  V  can  be  enlarged  to  create  a  basis  of  V. 


Theorem  9.43:  Basis  of  V 


If $V = \operatorname{span}\{u_1, \dots, u_n\}$ is a vector space, then some subset of $\{u_1, \dots, u_n\}$ is a basis for $V$. Also, if $\{u_1, \dots, u_k\} \subseteq V$ is linearly independent and the vector space is finite dimensional, then the set $\{u_1, \dots, u_k\}$ can be enlarged to obtain a basis of $V$.


Proof. Let

$$S = \{E \subseteq \{u_1, \dots, u_n\} \text{ such that } \operatorname{span}\{E\} = V\}.$$

For $E \in S$, let $|E|$ denote the number of elements of $E$. Let

$$m = \min\{|E| \text{ such that } E \in S\}.$$

Thus there exist vectors

$$\{v_1, \dots, v_m\} \subseteq \{u_1, \dots, u_n\}$$

such that

$$\operatorname{span}\{v_1, \dots, v_m\} = V$$

and $m$ is as small as possible for this to happen. If this set is linearly independent, it follows it is a basis for $V$ and the theorem is proved. On the other hand, if the set is not linearly independent, then there exist scalars $c_1, \dots, c_m$ such that

$$0 = \sum_{i=1}^{m} c_i v_i$$

and not all the $c_i$ are equal to zero. Suppose $c_k \neq 0$. Then solve for the vector $v_k$ in terms of the other vectors. Consequently,

$$V = \operatorname{span}\{v_1, \dots, v_{k-1}, v_{k+1}, \dots, v_m\}$$

contradicting the definition of $m$. This proves the first part of the theorem.

To obtain the second part, begin with $\{u_1, \dots, u_k\}$ and suppose a basis for $V$ is

$$\{v_1, \dots, v_n\}$$

If

$$\operatorname{span}\{u_1, \dots, u_k\} = V,$$

then $k = n$. If not, there exists a vector

$$u_{k+1} \notin \operatorname{span}\{u_1, \dots, u_k\}$$

Then from Lemma 9.23, $\{u_1, \dots, u_k, u_{k+1}\}$ is also linearly independent. Continue adding vectors in this way until $n$ linearly independent vectors have been obtained. Then

$$\operatorname{span}\{u_1, \dots, u_n\} = V$$

because if it did not do so, there would exist $u_{n+1}$ as just described and $\{u_1, \dots, u_{n+1}\}$ would be a linearly independent set of vectors having $n + 1$ elements. This contradicts the fact that $\{v_1, \dots, v_n\}$ is a basis. In turn this would contradict Theorem 9.35. Therefore, this list is a basis. ♠

Recall  Example  9.24  in  which  we  added  a  matrix  to  a  linearly  independent  set  to  create  a  larger  linearly 
independent  set.  By  Theorem  9.43  we  can  extend  a  linearly  independent  set  to  a  basis. 


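Solution. Recall from the solution of Example 9.24 that the set $R \subseteq M_{22}$ given by

$$R = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \right\}$$

is also linearly independent. However this set is still not a basis for $M_{22}$ as it is not a spanning set. In particular, $\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$ is not in $\operatorname{span} R$. Therefore, this matrix can be added to the set by Lemma 9.23 to obtain a new linearly independent set given by

$$T = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}$$

This set is linearly independent and now spans $M_{22}$. Hence $T$ is a basis. ♠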
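The extension step can be phrased as a small procedure: try each standard basis vector in turn and keep it whenever it lies outside the current span. The Python sketch below is one possible way to do this, not the method of the text. It assumes each 2 x 2 matrix is flattened row by row into a vector of length 4, and the names `rank` and `extend_to_basis` are hypothetical helpers.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    ncols = len(m[0]) if m else 0
    rk = 0
    for col in range(ncols):
        piv = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                f = m[i][col] / m[rk][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def extend_to_basis(vectors, dim):
    """Append standard basis vectors that enlarge the span until it is everything."""
    basis = [list(v) for v in vectors]
    for i in range(dim):
        e = [1 if j == i else 0 for j in range(dim)]
        # By Lemma 9.23, adding a vector outside the span keeps independence.
        if rank(basis + [e]) > rank(basis):
            basis.append(e)
    return basis

# The set R, with each matrix [[p, q], [r, s]] flattened to [p, q, r, s].
R = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
T = extend_to_basis(R, 4)
print(T)
```

The procedure assumes the input vectors are already linearly independent; it does not check this.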

Next  we  consider  the  case  where  you  have  a  spanning  set  and  you  want  a  subset  which  is  a  basis.  The 
above  discussion  involved  adding  vectors  to  a  set.  The  next  theorem  involves  removing  vectors. 
Theorem 9.45: Basis from a Spanning Set


Let  V  be  a  vector  space  and  let  W  be  a  subspace.  Also  suppose  that  W  =  span  {w\,-  •  • ,  w,„}.  Then 
there  exists  a  subset  of  { w\ ,  •  •  • ,  wm }  which  is  a  basis  for  W. 


Proof.  Let  S  denote  the  set  of  positive  integers  such  that  for  k  G  S,  there  exists  a  subset  of  {ivi,  •  •  • ,  wm } 
consisting  of  exactly  k  vectors  which  is  a  spanning  set  for  W.  Thus  m  G  S.  Pick  the  smallest  positive 
integer  in  S.  Call  it  k.  Then  there  exists  {u\,-  ■  ■  ,*4}  Q  {v?i,  •  •  •  ,wm}  such  that  span{«i,  •  •  •  ,*4}  =  VP.  If 

k 

Y  CiWi  =  0 
(=1 


and  not  all  of  the  c,-  =  0,  then  you  could  pick  cj  4  0,  divide  by  it  and  solve  for  uj  in  terms  of  the  others. 


Then  you  could  delete  Wj  from  the  list  and  have  the  same  span.  In  any  linear  combination  involving  wj, 
the  linear  combination  would  equal  one  in  which  Wj  is  replaced  with  the  above  sum,  showing  that  it  could 
have  been  obtained  as  a  linear  combination  of  Wj  for  i  ^  j.  Thus  k  —  1  G  S  contrary  to  the  choice  of  k  . 
Hence  each  c;-  =  0  and  so  {u\,  ■  ■  ■ ,  i4}  is  a  basis  for  W  consisting  of  vectors  of  {ivi,  •  •  •  ,wm}.  4 


Consider  the  following  example  of  this  concept. 


Example  9.46:  Basis  from  a  Spanning  Set 


Let $V$ be the vector space of polynomials of degree no more than 3, denoted earlier as $P_3$. Consider the following vectors in $V$.

$$2x^2 + x + 1,\quad x^3 + 4x^2 + 2x + 2,\quad 2x^3 + 2x^2 + 2x + 1,$$
$$x^3 + 4x^2 - 3x + 2,\quad x^3 + 3x^2 + 2x + 1$$

Then, as mentioned above, $V$ has dimension 4 and so clearly these vectors are not linearly independent. A basis for $V$ is $\{1, x, x^2, x^3\}$. Determine a linearly independent subset of these which has the same span. Determine whether this subset is a basis for $V$.


Solution. Consider an isomorphism which maps $\mathbb{R}^4$ to $V$ in the obvious way. Thus

$$\begin{bmatrix} 1 \\ 1 \\ 2 \\ 0 \end{bmatrix}$$

corresponds to $2x^2 + x + 1$ through the use of this isomorphism. Then corresponding to the above vectors in $V$ we would have the following vectors in $\mathbb{R}^4$.

$$\begin{bmatrix} 1 \\ 1 \\ 2 \\ 0 \end{bmatrix}, \begin{bmatrix} 2 \\ 2 \\ 4 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 2 \\ 2 \end{bmatrix}, \begin{bmatrix} 2 \\ -3 \\ 4 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 3 \\ 1 \end{bmatrix}$$

Now if we obtain a subset of these which has the same span but which is linearly independent, then the corresponding vectors from $V$ will also be linearly independent. If there are four in the list, then the resulting vectors from $V$ must be a basis for $V$. The reduced row-echelon form for the matrix which has the above vectors as columns is

$$\begin{bmatrix} 1 & 0 & 0 & -15 & 0 \\ 0 & 1 & 0 & 11 & 0 \\ 0 & 0 & 1 & -5 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$

Therefore, a basis for $V$ consists of the vectors

$$2x^2 + x + 1,\quad x^3 + 4x^2 + 2x + 2,\quad 2x^3 + 2x^2 + 2x + 1,\quad x^3 + 3x^2 + 2x + 1.$$

Note how this is a subset of the original set of vectors. If there had been only three pivot columns in this matrix, then we would not have had a basis for $V$ but we would at least have obtained a linearly independent subset of the original set of vectors in this way.

Note also that, since all linear relations are preserved by an isomorphism,

$$-15\left(2x^2 + x + 1\right) + 11\left(x^3 + 4x^2 + 2x + 2\right) + (-5)\left(2x^3 + 2x^2 + 2x + 1\right) = x^3 + 4x^2 - 3x + 2$$
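The pivot-column computation in this example can be checked mechanically. The following is a minimal Python sketch, not part of the original text: the helper `rref` is a hypothetical name, and the coordinate convention (constant term first, then the coefficients of $x$, $x^2$, $x^3$) is the one used in the solution above. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row-echelon form, list of pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0  # index of the next pivot row
    for c in range(len(m[0])):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]  # scale the pivot to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

# Columns are the coordinate vectors of the five polynomials of the example.
A = [
    [1, 2, 1, 2, 1],   # constant terms
    [1, 2, 2, -3, 2],  # coefficients of x
    [2, 4, 2, 4, 3],   # coefficients of x^2
    [0, 1, 2, 1, 1],   # coefficients of x^3
]
R, pivots = rref(A)
print(pivots)  # zero-based indices of the pivot columns
```

The pivot columns (first, second, third, and fifth) identify which of the original polynomials to keep, and the non-pivot fourth column of the reduced matrix holds the coefficients of the linear relation noted above.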


Consider  the  following  example. 


* 


Solution. First we need to show that $S$ spans $P_2$. Let $ax^2 + bx + c$ be an arbitrary polynomial in $P_2$. Write

$$ax^2 + bx + c = r(1) + s(x) + t(x^2) + u(x^2 + 1)$$

Then,

$$ax^2 + bx + c = r(1) + s(x) + t(x^2) + u(x^2 + 1) = (t + u)x^2 + s(x) + (r + u)$$

It follows that

$$a = t + u, \qquad b = s, \qquad c = r + u$$
Clearly a solution exists for all $a, b, c$ and so $S$ is a spanning set for $P_2$. By Theorem 9.45, some subset of $S$ is a basis for $P_2$.

Recall that a basis must be both a spanning set and a linearly independent set. Therefore we must remove a vector from $S$ keeping this in mind. Suppose we remove $x$ from $S$. The resulting set would be $\{1, x^2, x^2 + 1\}$. This set is clearly linearly dependent (and also does not span $P_2$) and so is not a basis.

Suppose we remove $x^2 + 1$ from $S$. The resulting set is $\{1, x, x^2\}$ which is both linearly independent and spans $P_2$. Hence this is a basis for $P_2$. Note that removing any one of $1$, $x^2$, or $x^2 + 1$ will result in a basis. ♠
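The claim about which vector may be removed can be verified with a determinant test. The sketch below is illustrative only: it assumes $S = \{1, x, x^2, x^2+1\}$ (read off from the linear combination used in the solution) and represents each polynomial by its coefficients in the order (constant, $x$, $x^2$). Three vectors in a three-dimensional space form a basis exactly when the determinant of their coordinate matrix is nonzero.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Coordinate vectors (constant, x, x^2) of the elements of S.
S = {
    "1": (1, 0, 0),
    "x": (0, 1, 0),
    "x^2": (0, 0, 1),
    "x^2+1": (1, 0, 1),
}

# For each choice of removed vector, test whether the remaining three form a basis.
remaining_is_basis = {}
for removed in S:
    rest = [v for name, v in S.items() if name != removed]
    remaining_is_basis[removed] = det3(rest) != 0

for removed, ok in remaining_is_basis.items():
    print(f"remove {removed}: basis = {ok}")
```

Only the removal of $x$ fails, in agreement with the discussion above.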

Now  the  following  is  a  fundamental  result  about  subspaces. 


Proof. Let the dimension of $V$ be $n$. Pick $w_1 \in W$ where $w_1 \neq 0$. If $w_1, \dots, w_s$ have been chosen such that $\{w_1, \dots, w_s\}$ is linearly independent, and if $\operatorname{span}\{w_1, \dots, w_s\} = W$, stop. You have the desired basis. Otherwise, there exists $w_{s+1} \notin \operatorname{span}\{w_1, \dots, w_s\}$ and $\{w_1, \dots, w_s, w_{s+1}\}$ is linearly independent. Continue this way until the process stops. It must stop since otherwise, you could obtain a linearly independent set of vectors having more than $n$ vectors which is impossible.

The last claim is proved by following the above procedure starting with $\{w_1, \dots, w_s\}$ as above. ♠

This  also  proves  the  following  corollary.  Let  V  play  the  role  of  W  in  the  above  theorem  and  begin  with 
a  basis  for  W ,  enlarging  it  to  form  a  basis  for  V  as  discussed  above. 


Consider  the  following  example. 
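Solution. An easy way to do this is to take the reduced row-echelon form of the matrix

$$\begin{bmatrix} 1 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 0 & 1 \end{bmatrix} \tag{9.3}$$

Note how the given vectors were placed as the first two and then the matrix was extended in such a way that it is clear that the span of the columns of this matrix yields all of $\mathbb{R}^4$. Now determine the pivot columns. The reduced row-echelon form is

$$\begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & -1 & 1 \\ 0 & 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 & 1 & -1 \end{bmatrix} \tag{9.4}$$

The pivot columns are the first four. The corresponding columns of (9.3) are

$$\begin{bmatrix} 1 \\ 0 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}$$

and now this is an extension of the given basis for $W$ to a basis for $\mathbb{R}^4$.

Why does this work? The columns of (9.3) obviously span $\mathbb{R}^4$. In fact, the span of the first four is the same as the span of all six. ♠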
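The same extension can be obtained by following the proof idea directly: keep adding a candidate vector whenever it lies outside the span of the vectors chosen so far. The sketch below is an illustration under stated assumptions, not the text's method; `rank` and `extend_to_basis` are hypothetical helper names, and the candidates are taken to be the standard basis vectors of $\mathbb{R}^4$, in the same order as the extra columns of (9.3).

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for c in range(len(rows[0]) if rows else 0):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def extend_to_basis(independent, candidates):
    """Append each candidate that is not in the span of the vectors kept so far."""
    basis = list(independent)
    for v in candidates:
        if rank(basis + [v]) > len(basis):
            basis.append(v)
    return basis

w1, w2 = [1, 0, 1, 1], [0, 1, 0, 1]  # the two given vectors (first two columns of 9.3)
standard = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
basis = extend_to_basis([w1, w2], standard)
print(basis)
```

Because the candidates are tried in column order, this greedy choice selects the same four vectors as the pivot columns of (9.3).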


Exercises 


Exercise 9.4.1 Let $M = \{u = (u_1, u_2, u_3, u_4) \in \mathbb{R}^4 : |u_1| \leq 4\}$. Is $M$ a subspace of $\mathbb{R}^4$?

Exercise 9.4.2 Let $M = \{u = (u_1, u_2, u_3, u_4) \in \mathbb{R}^4 : \sin(u_1) = 1\}$. Is $M$ a subspace of $\mathbb{R}^4$?

Exercise 9.4.3 Let $W$ be a subset of $M_{22}$ given by

$$W = \{A \mid A \in M_{22},\ A^T = A\}$$

In words, $W$ is the set of all symmetric $2 \times 2$ matrices. Is $W$ a subspace of $M_{22}$?


Exercise 9.4.4 Let $W$ be a subset of $M_{22}$ given by

$$W = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\middle|\, a, b, c, d \in \mathbb{R},\ a + b = c + d \right\}$$

Is $W$ a subspace of $M_{22}$?

Exercise 9.4.5 Let $W$ be a subset of $P_3$ given by

$$W = \{ax^3 + bx^2 + cx + d \mid a, b, c, d \in \mathbb{R},\ d = 0\}$$

Is $W$ a subspace of $P_3$?

Exercise 9.4.6 Let $W$ be a subset of $P_3$ given by

$$W = \{p(x) = ax^3 + bx^2 + cx + d \mid a, b, c, d \in \mathbb{R},\ p(2) = 1\}$$

Is $W$ a subspace of $P_3$?

9.5  Sums  and  Intersections 


We  begin  this  section  with  a  definition. 


Therefore the intersection of two subspaces is all the vectors shared by both. If there are no nonzero vectors shared by both subspaces, meaning that $U \cap W = \{0\}$, the sum $U + W$ takes on a special name.


Definition  9.52:  Direct  Sum 

Let $V$ be a vector space and suppose $U$ and $W$ are subspaces of $V$ such that $U \cap W = \{0\}$. Then the sum of $U$ and $W$ is called the direct sum and is denoted $U \oplus W$.

An interesting result is that both the sum $U + W$ and the intersection $U \cap W$ are subspaces of $V$.
Solution. By the subspace test, we must show three things:

1. $0 \in U \cap W$

2. For vectors $v_1, v_2 \in U \cap W$, $v_1 + v_2 \in U \cap W$

3. For scalar $a$ and vector $v \in U \cap W$, $av \in U \cap W$

We proceed to show each of these three conditions hold.

1. Since $U$ and $W$ are subspaces of $V$, they each contain $0$. By definition of the intersection, $0 \in U \cap W$.

2. Let $v_1, v_2 \in U \cap W$. Then in particular, $v_1, v_2 \in U$. Since $U$ is a subspace, it follows that $v_1 + v_2 \in U$. The same argument holds for $W$. Therefore $v_1 + v_2$ is in both $U$ and $W$ and by definition is also in $U \cap W$.

3. Let $a$ be a scalar and $v \in U \cap W$. Then in particular, $v \in U$. Since $U$ is a subspace, it follows that $av \in U$. The same argument holds for $W$ so $av$ is in both $U$ and $W$. By definition, it is in $U \cap W$.

Therefore $U \cap W$ is a subspace of $V$. ♠

It  can  also  be  shown  that  U  +  W  is  a  subspace  of  V. 

We  conclude  this  section  with  an  important  theorem  on  dimension. 


Notice that when $U \cap W = \{0\}$, the sum becomes the direct sum and the above equation becomes

$$\dim(U \oplus W) = \dim(U) + \dim(W)$$


9.6  Linear  Transformations 


Outcomes 


A.  Understand  the  definition  of  a  linear  transformation  in  the  context  of  vector  spaces. 


Recall  that  a  function  is  simply  a  transformation  of  a  vector  to  result  in  a  new  vector.  Consider  the 
following  definition. 
Definition 9.55: Linear Transformation


Let $V$ and $W$ be vector spaces. Suppose $T : V \to W$ is a function, where for each $x \in V$, $T(x) \in W$. Then $T$ is a linear transformation if whenever $k, p$ are scalars and $v_1$ and $v_2$ are vectors in $V$

$$T(kv_1 + pv_2) = kT(v_1) + pT(v_2)$$


Several  important  examples  of  linear  transformations  include  the  zero  transformation,  the  identity 
transformation,  and  the  scalar  transformation. 


Solution. We will show that the scalar transformation $s_a$ is linear; the rest are left as an exercise.

By Definition 9.55 we must show that for all scalars $k, p$ and vectors $v_1$ and $v_2$ in $V$, $s_a(kv_1 + pv_2) = ks_a(v_1) + ps_a(v_2)$. Assume that $a$ is also a scalar.

$$\begin{aligned} s_a(kv_1 + pv_2) &= a(kv_1 + pv_2) \\ &= akv_1 + apv_2 \\ &= k(av_1) + p(av_2) \\ &= ks_a(v_1) + ps_a(v_2) \end{aligned}$$

Therefore $s_a$ is a linear transformation. ♠


Consider  the  following  important  theorem. 
Proof.

1. Let $0_V$ denote the zero vector of $V$ and let $0_W$ denote the zero vector of $W$. We want to prove that $T(0_V) = 0_W$. Let $v \in V$. Then $0v = 0_V$ and

$$T(0_V) = T(0v) = 0T(v) = 0_W.$$

2. Let $v \in V$; then $-v \in V$ is the additive inverse of $v$, so $v + (-v) = 0_V$. Thus

$$\begin{aligned} T(v + (-v)) &= T(0_V) \\ T(v) + T(-v) &= 0_W \\ T(-v) &= 0_W - T(v) = -T(v). \end{aligned}$$

3. This result follows from preservation of addition and preservation of scalar multiplication. A formal proof would be by induction on $m$.

♠


Consider  the  following  example  using  the  above  theorem. 


Solution.  We  provide  two  solutions  to  this  problem. 

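Solution 1: Suppose $a(x^2 + x) + b(x^2 - x) + c(x^2 + 1) = 4x^2 + 5x - 3$. Then

$$(a + b + c)x^2 + (a - b)x + c = 4x^2 + 5x - 3.$$

Solving for $a$, $b$, and $c$ results in the unique solution $a = 6$, $b = 1$, $c = -3$.

Thus

$$\begin{aligned} T(4x^2 + 5x - 3) &= T\left(6(x^2 + x) + (x^2 - x) - 3(x^2 + 1)\right) \\ &= 6T(x^2 + x) + T(x^2 - x) - 3T(x^2 + 1) \\ &= 6(-1) + 1 - 3(3) = -14. \end{aligned}$$

Solution 2: Notice that $S = \{x^2 + x, x^2 - x, x^2 + 1\}$ is a basis of $P_2$, and thus $x^2$, $x$, and $1$ can each be written as a linear combination of elements of $S$.

$$\begin{aligned} x^2 &= \tfrac{1}{2}(x^2 + x) + \tfrac{1}{2}(x^2 - x) \\ x &= \tfrac{1}{2}(x^2 + x) - \tfrac{1}{2}(x^2 - x) \\ 1 &= (x^2 + 1) - \tfrac{1}{2}(x^2 + x) - \tfrac{1}{2}(x^2 - x). \end{aligned}$$

Then

$$\begin{aligned} T(x^2) &= T\left(\tfrac{1}{2}(x^2 + x) + \tfrac{1}{2}(x^2 - x)\right) = \tfrac{1}{2}T(x^2 + x) + \tfrac{1}{2}T(x^2 - x) \\ &= \tfrac{1}{2}(-1) + \tfrac{1}{2}(1) = 0. \\ T(x) &= T\left(\tfrac{1}{2}(x^2 + x) - \tfrac{1}{2}(x^2 - x)\right) = \tfrac{1}{2}T(x^2 + x) - \tfrac{1}{2}T(x^2 - x) \\ &= \tfrac{1}{2}(-1) - \tfrac{1}{2}(1) = -1. \\ T(1) &= T\left((x^2 + 1) - \tfrac{1}{2}(x^2 + x) - \tfrac{1}{2}(x^2 - x)\right) \\ &= T(x^2 + 1) - \tfrac{1}{2}T(x^2 + x) - \tfrac{1}{2}T(x^2 - x) \\ &= 3 - \tfrac{1}{2}(-1) - \tfrac{1}{2}(1) = 3. \end{aligned}$$

Therefore,

$$\begin{aligned} T(4x^2 + 5x - 3) &= 4T(x^2) + 5T(x) - 3T(1) \\ &= 4(0) + 5(-1) - 3(3) = -14. \end{aligned}$$

The advantage of Solution 2 over Solution 1 is that if you were now asked to find $T(-6x^2 - 13x + 9)$, it is easy to use $T(x^2) = 0$, $T(x) = -1$ and $T(1) = 3$:

$$\begin{aligned} T(-6x^2 - 13x + 9) &= -6T(x^2) - 13T(x) + 9T(1) \\ &= -6(0) - 13(-1) + 9(3) = 13 + 27 = 40. \end{aligned}$$

More generally,

$$\begin{aligned} T(ax^2 + bx + c) &= aT(x^2) + bT(x) + cT(1) \\ &= a(0) + b(-1) + c(3) = -b + 3c. \end{aligned}$$

♠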
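The approach of Solution 2 translates directly into a short computation: once the values of $T$ on $x^2$, $x$, and $1$ are known, linearity gives $T$ on every polynomial in $P_2$. The Python sketch below is illustrative only; the function name `T` and the coefficient-triple representation $(a, b, c)$ for $ax^2 + bx + c$ are choices made here, and the values $T(x^2+x) = -1$, $T(x^2-x) = 1$, $T(x^2+1) = 3$ are those used in the solution.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Given values of T on the basis S = {x^2 + x, x^2 - x, x^2 + 1}.
T_s1, T_s2, T_s3 = Fraction(-1), Fraction(1), Fraction(3)

# Values on x^2, x, 1, obtained from the linear combinations in Solution 2.
T_x2 = half * T_s1 + half * T_s2
T_x = half * T_s1 - half * T_s2
T_one = T_s3 - half * T_s1 - half * T_s2

def T(a, b, c):
    """Value of T at a*x^2 + b*x + c, computed by linearity."""
    return a * T_x2 + b * T_x + c * T_one

print(T(4, 5, -3))    # T(4x^2 + 5x - 3)
print(T(-6, -13, 9))  # T(-6x^2 - 13x + 9)
```

As a consistency check, evaluating `T` at the three elements of $S$ should return the given values, and for general input the result agrees with the formula $-b + 3c$.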
Suppose two linear transformations act in the same way on every vector $v$. Then we say that these transformations are equal.


The  definition  above  requires  that  two  transformations  have  the  same  action  on  every  vector  in  or¬ 
der  for  them  to  be  equal.  The  next  theorem  argues  that  it  is  only  necessary  to  check  the  action  of  the 
transformations  on  basis  vectors. 


Theorem  9.60:  Transformation  of  a  Spanning  Set 


Let $V$ and $W$ be vector spaces and suppose that $S$ and $T$ are linear transformations from $V$ to $W$. Then in order for $S$ and $T$ to be equal, it suffices that $S(v_i) = T(v_i)$ for each $i$, where $V = \operatorname{span}\{v_1, v_2, \dots, v_n\}$.


This  theorem  tells  us  that  a  linear  transformation  is  completely  determined  by  its  actions  on  a  spanning 
set.  We  can  also  examine  the  effect  of  a  linear  transformation  on  a  basis. 


Exercises 


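Exercise 9.6.1 Let $T : P_2 \to \mathbb{R}$ be a linear transformation such that

$$T(x^2) = 1;\quad T(x^2 + x) = 5;\quad T(x^2 + x + 1) = -1.$$

Find $T(ax^2 + bx + c)$.

Exercise 9.6.2 Consider the following functions $T : \mathbb{R}^3 \to \mathbb{R}^2$. Explain why each of these functions $T$ is not linear.

(a) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z + 1 \\ 2y - 3x + z \end{bmatrix}$

(b) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y^2 + 3z \\ 2y + 3x + z \end{bmatrix}$

(c) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} \sin x + 2y + 3z \\ 2y + 3x + z \end{bmatrix}$

(d) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z \\ 2y + 3x - \ln z \end{bmatrix}$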

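Exercise 9.6.3 Suppose $T$ is a linear transformation such that

$$T \begin{bmatrix} 1 \\ 1 \\ -7 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \\ 3 \end{bmatrix}, \qquad T \begin{bmatrix} -1 \\ 0 \\ 6 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \qquad T \begin{bmatrix} 0 \\ -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ -1 \end{bmatrix}$$

Find the matrix of $T$. That is, find $A$ such that $T(\vec{x}) = A\vec{x}$.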


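Exercise 9.6.4 Suppose $T$ is a linear transformation such that

$$T \begin{bmatrix} 1 \\ 2 \\ -18 \end{bmatrix} = \begin{bmatrix} 5 \\ 2 \\ 5 \end{bmatrix}, \qquad T \begin{bmatrix} -1 \\ -1 \\ 15 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \\ 5 \end{bmatrix}, \qquad T \begin{bmatrix} 0 \\ -1 \\ 4 \end{bmatrix} = \begin{bmatrix} 2 \\ 5 \\ -2 \end{bmatrix}$$

Find the matrix of $T$. That is, find $A$ such that $T(\vec{x}) = A\vec{x}$.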


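Exercise 9.6.5 Consider the following functions $T : \mathbb{R}^3 \to \mathbb{R}^2$. Show that each is a linear transformation and determine for each the matrix $A$ such that $T(\vec{x}) = A\vec{x}$.

(a) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z \\ 2y - 3x + z \end{bmatrix}$

(b) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 7x + 2y + z \\ 3x - 11y + 2z \end{bmatrix}$

(c) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3x + 2y + z \\ x + 2y + 6z \end{bmatrix}$

(d) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2y - 5x + z \\ x + y + z \end{bmatrix}$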

Exercise 9.6.6 Suppose

$$\begin{bmatrix} A_1 & \cdots & A_n \end{bmatrix}^{-1}$$

exists where each $A_j \in \mathbb{R}^n$ and let vectors $\{B_1, \dots, B_n\}$ in $\mathbb{R}^m$ be given. Show that there always exists a linear transformation $T$ such that $T(A_i) = B_i$.


9.7  Isomorphisms 


Outcomes 


A.  Apply  the  concepts  of  one  to  one  and  onto  to  transformations  of  vector  spaces. 

B.  Determine  if  a  linear  transformation  of  vector  spaces  is  an  isomorphism. 

C.  Determine  if  two  vector  spaces  are  isomorphic. 


9.7.1.  One  to  One  and  Onto  Transformations 


Recall  the  following  definitions,  given  here  in  terms  of  vector  spaces. 
Recall that every linear transformation $T$ has the property that $T(0) = 0$. This will be necessary to prove the following useful lemma.


Lemma 9.64: One to One


The assertion that a linear transformation $T$ is one to one is equivalent to saying that if $T(v) = 0$, then $v = 0$.


Proof. Suppose first that $T$ is one to one.

$$T(0) = T(0 + 0) = T(0) + T(0)$$

and so, adding the additive inverse of $T(0)$ to both sides, one sees that $T(0) = 0$. Therefore, if $T(v) = 0$, it must be the case that $v = 0$ because it was just shown that $T(0) = 0$ and $T$ is one to one.

Now suppose that if $T(v) = 0$, then $v = 0$. If $T(v) = T(u)$, then $T(v) - T(u) = T(v - u) = 0$ which shows that $v - u = 0$ or in other words, $v = u$. Hence $T$ is one to one. ♠

Consider  the  following  example. 


Solution. By definition,

$$\ker(S) = \{ax^2 + bx + c \in P_2 \mid a + b = 0,\ a + c = 0,\ b - c = 0,\ b + c = 0\}.$$

Suppose $p(x) = ax^2 + bx + c \in \ker(S)$. This leads to a homogeneous system of four equations in three variables. Putting the augmented matrix in reduced row-echelon form:

$$\left[\begin{array}{ccc|c} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 1 & 1 & 0 \end{array}\right] \to \cdots \to \left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]$$

The solution is $a = b = c = 0$. This tells us that if $S(p(x)) = 0$, then $p(x) = ax^2 + bx + c = 0x^2 + 0x + 0 = 0$. Therefore $S$ is one to one.
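To show that $S$ is not onto, find a matrix $A \in M_{22}$ such that for every $p(x) \in P_2$, $S(p(x)) \neq A$. Let

$$A = \begin{bmatrix} 0 & 1 \\ 0 & 2 \end{bmatrix},$$

and suppose $p(x) = ax^2 + bx + c \in P_2$ is such that $S(p(x)) = A$. Then

$$\begin{aligned} a + b &= 0 & a + c &= 1 \\ b - c &= 0 & b + c &= 2 \end{aligned}$$

Solving this system

$$\left[\begin{array}{ccc|c} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \\ 0 & 1 & -1 & 0 \\ 0 & 1 & 1 & 2 \end{array}\right] \to \left[\begin{array}{ccc|c} 1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 1 \\ 0 & 1 & -1 & 0 \\ 0 & 1 & 1 & 2 \end{array}\right]$$

Adding the second and third rows of the last matrix gives the equation $0 = 1$. Since the system is inconsistent, there is no $p(x) \in P_2$ so that $S(p(x)) = A$, and therefore $S$ is not onto.

♠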
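Both conclusions of this solution reduce to rank computations. The sketch below is an illustration under an explicit assumption: the formula $S(ax^2 + bx + c) = \begin{bmatrix} a+b & a+c \\ b-c & b+c \end{bmatrix}$ is inferred from the kernel conditions and the system above, since the statement of the example is not reproduced here. The helper name `rank` is hypothetical. $S$ is one to one when the coefficient matrix has rank 3 (only the trivial solution), and $A$ has no preimage when appending the right-hand side raises the rank.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix (list of rows), by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(rows[0]) if rows else 0):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# Rows encode a + b, a + c, b - c, b + c in the unknowns (a, b, c).
coeff = [[1, 1, 0], [1, 0, 1], [0, 1, -1], [0, 1, 1]]
# Right-hand side from the entries of A = [[0, 1], [0, 2]], in the same order.
rhs = [0, 1, 0, 2]
augmented = [row + [b] for row, b in zip(coeff, rhs)]

one_to_one = rank(coeff) == 3                  # trivial kernel
A_in_image = rank(augmented) == rank(coeff)    # system consistent?
print(one_to_one, A_in_image)
```

Comparing the rank of the coefficient matrix with that of the augmented matrix is the standard consistency test for a linear system; here it reports that the kernel is trivial and that the system for $A$ is inconsistent.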


Example  9.66:  An  Onto  Transformation 


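Let $T : M_{22} \to \mathbb{R}^2$ be a linear transformation defined by

$$T \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a + d \\ b + c \end{bmatrix} \quad \text{for all } \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in M_{22}.$$

Prove that $T$ is onto but not one to one.


Solution. Let $\begin{bmatrix} x \\ y \end{bmatrix}$ be an arbitrary vector in $\mathbb{R}^2$. Since $T \begin{bmatrix} x & y \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix}$, $T$ is onto.

By Lemma 9.64, $T$ is one to one if and only if $T(A) = 0$ implies that $A = 0$, the zero matrix. Observe that

$$T\left( \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \right) = \begin{bmatrix} 1 + (-1) \\ 0 + 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

There exists a nonzero matrix $A$ such that $T(A) = 0$. It follows that $T$ is not one to one. ♠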

The  following  example  demonstrates  that  a  one  to  one  transformation  preserves  linear  independence. 


Example  9.67:  One  to  One  and  Independence 


Let $V$ and $W$ be vector spaces and $T : V \to W$ a linear transformation. Prove that if $T$ is one to one and $\{v_1, v_2, \dots, v_k\}$ is an independent subset of $V$, then $\{T(v_1), T(v_2), \dots, T(v_k)\}$ is an independent subset of $W$.


Solution. Let $0_V$ and $0_W$ denote the zero vectors of $V$ and $W$, respectively. Suppose that

$$a_1 T(v_1) + a_2 T(v_2) + \cdots + a_k T(v_k) = 0_W$$

for some $a_1, a_2, \dots, a_k \in \mathbb{R}$. Since linear transformations preserve linear combinations (addition and scalar multiplication),

$$T(a_1 v_1 + a_2 v_2 + \cdots + a_k v_k) = 0_W.$$

Now, since $T$ is one to one, $\ker(T) = \{0_V\}$, and thus

$$a_1 v_1 + a_2 v_2 + \cdots + a_k v_k = 0_V.$$

However, $\{v_1, v_2, \dots, v_k\}$ is independent so $a_1 = a_2 = \cdots = a_k = 0$. Therefore, $\{T(v_1), T(v_2), \dots, T(v_k)\}$ is independent. ♠

A  similar  claim  can  be  made  regarding  onto  transformations.  In  this  case,  an  onto  transformation 
preserves  a  spanning  set. 


Solution. Suppose that $T$ is onto and let $w \in W$. Then there exists $v \in V$ such that $T(v) = w$. Since $V = \operatorname{span}\{v_1, v_2, \dots, v_k\}$, there exist $a_1, a_2, \dots, a_k \in \mathbb{R}$ such that $v = a_1 v_1 + a_2 v_2 + \cdots + a_k v_k$. Using the fact that $T$ is a linear transformation,

$$\begin{aligned} w = T(v) &= T(a_1 v_1 + a_2 v_2 + \cdots + a_k v_k) \\ &= a_1 T(v_1) + a_2 T(v_2) + \cdots + a_k T(v_k), \end{aligned}$$

i.e., $w \in \operatorname{span}\{T(v_1), T(v_2), \dots, T(v_k)\}$, and thus

$$W \subseteq \operatorname{span}\{T(v_1), T(v_2), \dots, T(v_k)\}.$$

Since $T(v_1), T(v_2), \dots, T(v_k) \in W$, it follows that $\operatorname{span}\{T(v_1), T(v_2), \dots, T(v_k)\} \subseteq W$, and therefore $W = \operatorname{span}\{T(v_1), T(v_2), \dots, T(v_k)\}$. ♠
9.7.2. Isomorphisms


The  focus  of  this  section  is  on  linear  transformations  which  are  both  one  to  one  and  onto.  When  this  is  the 
case,  we  call  the  transformation  an  isomorphism. 


Definition  9.70:  Isomorphic 


Let $V$ and $W$ be two vector spaces and let $T : V \to W$ be a linear transformation. Then if $T$ is an isomorphism, we say that $V$ and $W$ are isomorphic.


Consider  the  following  example  of  an  isomorphism. 


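Solution. Notice that if we can prove $T$ is an isomorphism, it will mean that $M_{22}$ and $\mathbb{R}^4$ are isomorphic. It remains to prove that


1. $T$ is a linear transformation;

2. $T$ is one-to-one;

3. $T$ is onto.

$T$ is linear: Let $k, p$ be scalars.

$$\begin{aligned} T\left( k \begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix} + p \begin{bmatrix} a_2 & b_2 \\ c_2 & d_2 \end{bmatrix} \right) &= T\left( \begin{bmatrix} ka_1 & kb_1 \\ kc_1 & kd_1 \end{bmatrix} + \begin{bmatrix} pa_2 & pb_2 \\ pc_2 & pd_2 \end{bmatrix} \right) \\ &= T\left( \begin{bmatrix} ka_1 + pa_2 & kb_1 + pb_2 \\ kc_1 + pc_2 & kd_1 + pd_2 \end{bmatrix} \right) \\ &= \begin{bmatrix} ka_1 + pa_2 \\ kb_1 + pb_2 \\ kc_1 + pc_2 \\ kd_1 + pd_2 \end{bmatrix} = \begin{bmatrix} ka_1 \\ kb_1 \\ kc_1 \\ kd_1 \end{bmatrix} + \begin{bmatrix} pa_2 \\ pb_2 \\ pc_2 \\ pd_2 \end{bmatrix} \\ &= k \begin{bmatrix} a_1 \\ b_1 \\ c_1 \\ d_1 \end{bmatrix} + p \begin{bmatrix} a_2 \\ b_2 \\ c_2 \\ d_2 \end{bmatrix} \\ &= kT\left( \begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix} \right) + pT\left( \begin{bmatrix} a_2 & b_2 \\ c_2 & d_2 \end{bmatrix} \right) \end{aligned}$$

Therefore $T$ is linear.


$T$ is one-to-one: By Lemma 9.64 we need to show that if $T(A) = 0$ then $A = 0$, for every matrix $A \in M_{22}$. Suppose

$$T \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

This clearly only occurs when $a = b = c = d = 0$ which means that

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = 0$$

Hence $T$ is one-to-one.

$T$ is onto: Let

$$\vec{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \in \mathbb{R}^4,$$

and define matrix $A \in M_{22}$ as follows:

$$A = \begin{bmatrix} x_1 & x_2 \\ x_3 & x_4 \end{bmatrix}$$

Then $T(A) = \vec{x}$, and therefore $T$ is onto.

Since $T$ is a linear transformation which is one-to-one and onto, $T$ is an isomorphism. Hence $M_{22}$ and $\mathbb{R}^4$ are isomorphic. ♠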

An  important  property  of  isomorphisms  is  that  the  inverse  of  an  isomorphism  is  itself  an  isomorphism 
and  the  composition  of  isomorphisms  is  an  isomorphism.  We  first  recall  the  definition  of  composition. 
Consider now the following proposition.


Proposition  9.73:  Composite  and  Inverse  Isomorphism 


Let $T : V \to W$ be an isomorphism. Then $T^{-1} : W \to V$ is also an isomorphism. Also if $T : V \to W$ is an isomorphism and if $S : W \to Z$ is an isomorphism for the vector spaces $V, W, Z$, then $S \circ T$ defined by $(S \circ T)(v) = S(T(v))$ is also an isomorphism.


Proof. Consider the first claim. Since $T$ is onto, a typical vector in $W$ is of the form $T(v)$ where $v \in V$. Consider then for $a, b$ scalars,

$$T^{-1}(aT(v_1) + bT(v_2))$$

where $v_1, v_2 \in V$. Is this equal to

$$aT^{-1}(T(v_1)) + bT^{-1}(T(v_2)) = av_1 + bv_2?$$

Since $T$ is one to one, this will be so if

$$T(av_1 + bv_2) = T\left(T^{-1}(aT(v_1) + bT(v_2))\right) = aT(v_1) + bT(v_2)$$

However, the above statement is just the condition that $T$ is a linear map. Thus $T^{-1}$ is indeed a linear map. If $v \in V$ is given, then $v = T^{-1}(T(v))$ and so $T^{-1}$ is onto. If $T^{-1}(w) = 0$ for some $w \in W$, then

$$w = T(T^{-1}(w)) = T(0) = 0$$

and so $T^{-1}$ is one to one.

Next suppose $T$ and $S$ are as described. Why is $S \circ T$ a linear map? For $a, b$ scalars,

$$\begin{aligned} (S \circ T)(av_1 + bv_2) &= S(T(av_1 + bv_2)) = S(aT(v_1) + bT(v_2)) \\ &= aS(T(v_1)) + bS(T(v_2)) = a(S \circ T)(v_1) + b(S \circ T)(v_2) \end{aligned}$$

Hence $S \circ T$ is a linear map. If $(S \circ T)(v) = 0$, then $S(T(v)) = 0$ and, since $S$ is one to one, it follows from Lemma 9.64 that $T(v) = 0$ and hence, by this lemma again, $v = 0$. Thus $S \circ T$ is one to one. It remains to verify that it is onto. Let $z \in Z$. Then since $S$ is onto, there exists $w \in W$ such that $S(w) = z$. Also, since $T$ is onto, there exists $v \in V$ such that $T(v) = w$. It follows that $S(T(v)) = z$ and so $S \circ T$ is also onto. ♠

Suppose  we  say  that  two  vector  spaces  V  and  W  are  related  if  there  exists  an  isomorphism  of  one  to 
the  other,  written  as  V  ~  W.  Then  the  above  proposition  suggests  that  ~  is  an  equivalence  relation.  That 
is:  ~  satisfies  the  following  conditions: 
• $V \sim V$

• If $V \sim W$, it follows that $W \sim V$

• If $V \sim W$ and $W \sim Z$, then $V \sim Z$

We  leave  the  proof  of  these  to  the  reader. 

The  following  fundamental  lemma  describes  the  relation  between  bases  and  isomorphisms. 


Lemma  9.74:  Bases  and  Isomorphisms 


Let $T : V \to W$ be a linear map where $V, W$ are vector spaces. Then a linear transformation $T$ which is one to one has the property that if $\{u_1, \dots, u_k\}$ is linearly independent, then so is $\{T(u_1), \dots, T(u_k)\}$. More generally, $T$ is an isomorphism if and only if whenever $\{v_1, \dots, v_n\}$ is a basis for $V$, it follows that $\{T(v_1), \dots, T(v_n)\}$ is a basis for $W$.


Proof. First suppose that $T$ is a linear map and is one to one and $\{u_1, \dots, u_k\}$ is linearly independent. It is required to show that $\{T(u_1), \dots, T(u_k)\}$ is also linearly independent. Suppose then that

$$\sum_{i=1}^{k} c_i T(u_i) = 0$$

Then, since $T$ is linear,

$$T\left( \sum_{i=1}^{k} c_i u_i \right) = 0$$

Since $T$ is one to one, it follows that

$$\sum_{i=1}^{k} c_i u_i = 0$$

Now the fact that $\{u_1, \dots, u_k\}$ is linearly independent implies that each $c_i = 0$. Hence $\{T(u_1), \dots, T(u_k)\}$ is linearly independent.

Now suppose that $T$ is an isomorphism and $\{v_1, \dots, v_n\}$ is a basis for $V$. It was just shown that $\{T(v_1), \dots, T(v_n)\}$ is linearly independent. It remains to verify that the span of $\{T(v_1), \dots, T(v_n)\}$ is all of $W$. This is where the fact that $T$ is onto is used. If $w \in W$, there exists $v \in V$ such that $T(v) = w$. Since $\{v_1, \dots, v_n\}$ is a basis, it follows that there exist scalars $\{c_i\}_{i=1}^{n}$ such that

$$\sum_{i=1}^{n} c_i v_i = v.$$

Hence,

$$w = T(v) = T\left( \sum_{i=1}^{n} c_i v_i \right) = \sum_{i=1}^{n} c_i T(v_i)$$

which shows that the span of these vectors $\{T(v_1), \dots, T(v_n)\}$ is all of $W$, showing that this set of vectors is a basis for $W$.


Next suppose that $T$ is a linear map which takes a basis to a basis. Then for $\{v_1, \dots, v_n\}$ a basis for $V$, it follows $\{T(v_1), \dots, T(v_n)\}$ is a basis for $W$. Then if $w \in W$, there exist scalars $c_i$ such that $w = \sum_{i=1}^{n} c_i T(v_i) = T\left( \sum_{i=1}^{n} c_i v_i \right)$ showing that $T$ is onto. If $T\left( \sum_{i=1}^{n} c_i v_i \right) = 0$ then $\sum_{i=1}^{n} c_i T(v_i) = 0$ and since the vectors $\{T(v_1), \dots, T(v_n)\}$ are linearly independent, it follows that each $c_i = 0$. Since $\sum_{i=1}^{n} c_i v_i$ is a typical vector in $V$, this has shown that if $T(v) = 0$ then $v = 0$ and so $T$ is also one to one. Thus $T$ is an isomorphism. ♠

The  following  theorem  illustrates  a  very  useful  idea  for  defining  an  isomorphism.  Basically,  if  you 
know  what  it  does  to  a  basis,  then  you  can  construct  the  isomorphism. 


Theorem  9.75:  Isomorphic  Vector  Spaces 


Suppose  V  and  W  are  two  vector  spaces.  Then  the  two  vector  spaces  are  isomorphic  if  and  only  if 
they  have  the  same  dimension.  In  the  case  that  the  two  vector  spaces  have  the  same  dimension,  then 
for a linear transformation $T : V \to W$, the following are equivalent.

1.  T  is  one  to  one. 

2.  T  is  onto. 

3.  T  is  an  isomorphism. 


Proof. Suppose first these two vector spaces have the same dimension. Let a basis for $V$ be $\{v_1, \dots, v_n\}$ and let a basis for $W$ be $\{w_1, \dots, w_n\}$. Now define $T$ as follows.

$$T(v_i) = w_i$$

For $\sum_{i=1}^{n} c_i v_i$ an arbitrary vector of $V$,

$$T\left( \sum_{i=1}^{n} c_i v_i \right) = \sum_{i=1}^{n} c_i T(v_i) = \sum_{i=1}^{n} c_i w_i.$$

It is necessary to verify that this is well defined. Suppose then that

$$\sum_{i=1}^{n} c_i v_i = \sum_{i=1}^{n} \hat{c}_i v_i$$

Then

$$\sum_{i=1}^{n} (c_i - \hat{c}_i) v_i = 0$$

and since $\{v_1, \dots, v_n\}$ is a basis, $c_i = \hat{c}_i$ for each $i$. Hence

$$\sum_{i=1}^{n} c_i w_i = \sum_{i=1}^{n} \hat{c}_i w_i$$

and so the mapping is well defined. Also if $a, b$ are scalars,

$$\begin{aligned} T\left( a \sum_{i=1}^{n} c_i v_i + b \sum_{i=1}^{n} \hat{c}_i v_i \right) &= T\left( \sum_{i=1}^{n} (a c_i + b \hat{c}_i) v_i \right) = \sum_{i=1}^{n} (a c_i + b \hat{c}_i) w_i \\ &= a \sum_{i=1}^{n} c_i w_i + b \sum_{i=1}^{n} \hat{c}_i w_i \\ &= aT\left( \sum_{i=1}^{n} c_i v_i \right) + bT\left( \sum_{i=1}^{n} \hat{c}_i v_i \right) \end{aligned}$$

Thus $T$ is a linear map.

Now if

$$T\left( \sum_{i=1}^{n} c_i v_i \right) = \sum_{i=1}^{n} c_i w_i = 0,$$

then since the $\{w_1, \dots, w_n\}$ are independent, each $c_i = 0$ and so $\sum_{i=1}^{n} c_i v_i = 0$ also. Hence $T$ is one to one. If $\sum_{i=1}^{n} c_i w_i$ is a vector in $W$, then it equals

$$\sum_{i=1}^{n} c_i T(v_i) = T\left( \sum_{i=1}^{n} c_i v_i \right)$$

showing that $T$ is also onto. Hence $T$ is an isomorphism and so $V$ and $W$ are isomorphic.

Next suppose these two vector spaces are isomorphic. Let $T$ be the name of the isomorphism. Then for $\{v_1, \dots, v_n\}$ a basis for $V$, it follows that a basis for $W$ is $\{T(v_1), \dots, T(v_n)\}$ showing that the two vector spaces have the same dimension.

Now suppose the two vector spaces have the same dimension.

First consider the claim that 1.) $\Rightarrow$ 2.). If $T$ is one to one, then if $\{v_1, \dots, v_n\}$ is a basis for $V$, then $\{T(v_1), \dots, T(v_n)\}$ is linearly independent. If it is not a basis, then it must fail to span $W$. But then there would exist $w \notin \operatorname{span}\{T(v_1), \dots, T(v_n)\}$ and it follows that $\{T(v_1), \dots, T(v_n), w\}$ would be linearly independent which is impossible because there exists a basis for $W$ of $n$ vectors. Hence

$$\operatorname{span}\{T(v_1), \dots, T(v_n)\} = W$$

and so $\{T(v_1), \dots, T(v_n)\}$ is a basis. Hence, if $w \in W$, there exist scalars $c_i$ such that

$$w = \sum_{i=1}^{n} c_i T(v_i) = T\left( \sum_{i=1}^{n} c_i v_i \right)$$

showing that $T$ is onto. This shows that 1.) $\Rightarrow$ 2.).

Next consider the claim that 2.) $\Rightarrow$ 3.). Since 2.) holds, it follows that $T$ is onto. It remains to verify that $T$ is one to one. Since $T$ is onto, there exists a basis of the form $\{T(v_1), \dots, T(v_n)\}$. If $\{v_1, \dots, v_n\}$ is linearly independent, then this set of vectors must also be a basis for $V$ because if not, there would exist $u \notin \operatorname{span}\{v_1, \dots, v_n\}$ so $\{v_1, \dots, v_n, u\}$ would be a linearly independent set which is impossible because by assumption, there exists a basis which has $n$ vectors. So why is $\{v_1, \dots, v_n\}$ linearly independent? Suppose

$$\sum_{i=1}^{n} c_i v_i = 0$$

Then

$$\sum_{i=1}^{n} c_i T(v_i) = 0$$

Hence each $c_i = 0$ and so, as just discussed, $\{v_1, \dots, v_n\}$ is a basis for $V$. Now it follows that a typical vector in $V$ is of the form $\sum_{i=1}^{n} c_i v_i$. If $T\left( \sum_{i=1}^{n} c_i v_i \right) = 0$, it follows that

$$\sum_{i=1}^{n} c_i T(v_i) = 0$$

and so, since $\{T(v_1), \dots, T(v_n)\}$ is independent, it follows each $c_i = 0$ and hence $\sum_{i=1}^{n} c_i v_i = 0$. Thus $T$ is one to one as well as onto and so it is an isomorphism.

If $T$ is an isomorphism, it is both one to one and onto by definition so 3.) implies both 1.) and 2.). ♠

Note the interesting way of defining a linear transformation in the first part of the argument by describing
what it does to a basis and then “extending it linearly”.

Consider  the  following  example. 


Example  9.76: 


Let $V = \mathbb{R}^3$ and let W denote the polynomials of degree at most 2. Show that these two vector
spaces are isomorphic.


Solution. First, observe that a basis for W is $\{1, x, x^2\}$ and a basis for V is $\{e_1, e_2, e_3\}$. Since these two
have the same dimension, the two are isomorphic. An example of an isomorphism is this:

\[
T(e_1) = 1, \quad T(e_2) = x, \quad T(e_3) = x^2
\]

and extend T linearly as in the above proof. Thus

\[
T(a, b, c) = a + bx + cx^2
\]

♠
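
The isomorphism in this example can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `T` and `T_inverse` and the sample vectors are illustrative assumptions, and polynomials are represented as Python callables. The inverse recovers the coefficients by sampling the polynomial at 0, 1 and -1.

```python
from fractions import Fraction

def T(v):
    """Send (a, b, c) in R^3 to the polynomial a + b x + c x^2 (as a callable)."""
    a, b, c = v
    return lambda x: a + b * x + c * x * x

def T_inverse(p):
    """Recover (a, b, c) from a polynomial of degree at most 2 by sampling it."""
    p0, p1, pm1 = p(Fraction(0)), p(Fraction(1)), p(Fraction(-1))
    a = p0                      # p(0) = a
    b = (p1 - pm1) / 2          # p(1) - p(-1) = 2b
    c = (p1 + pm1) / 2 - p0     # p(1) + p(-1) = 2a + 2c
    return (a, b, c)

u = (Fraction(1), Fraction(-2), Fraction(3))
w = (Fraction(4), Fraction(0), Fraction(-1))
s, t = Fraction(2), Fraction(-5)

# Linearity, checked pointwise: T(s u + t w) = s T(u) + t T(w).
combo = tuple(s * ui + t * wi for ui, wi in zip(u, w))
linear_ok = all(
    T(combo)(x) == s * T(u)(x) + t * T(w)(x)
    for x in map(Fraction, range(-3, 4))
)

# One to one and onto: T_inverse undoes T on the sample vector.
round_trip = T_inverse(T(u))
print(linear_ok, round_trip == u)
```

A pointwise check on a few sample values is only a sanity test; the proof above is what establishes that T is an isomorphism.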


Exercises 


Exercise 9.7.1 Let V and W be subspaces of $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively and let $T : V \to W$ be a linear
transformation. Suppose that $\{Tv_1, \cdots, Tv_r\}$ is linearly independent. Show that it must be the case that
$\{v_1, \cdots, v_r\}$ is also linearly independent.


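Exercise 9.7.2 Let

\[
V = \operatorname{span}\left\{
\begin{bmatrix} 1 \\ 1 \\ 2 \\ 0 \end{bmatrix},
\begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \end{bmatrix},
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \end{bmatrix}
\right\}
\]

Let $Tx = Ax$ where A is the matrix

\[
\begin{bmatrix}
1 & 1 & 1 & 1 \\
0 & 1 & 1 & 0 \\
0 & 1 & 2 & 1 \\
1 & 1 & 1 & 2
\end{bmatrix}
\]

Give a basis for im(T).

Exercise 9.7.3 Let

\[
V = \operatorname{span}\left\{
\begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix},
\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix},
\begin{bmatrix} 1 \\ 4 \\ 4 \\ 1 \end{bmatrix}
\right\}
\]

Let $Tx = Ax$ where A is the matrix

\[
\begin{bmatrix}
1 & 1 & 1 & 1 \\
0 & 1 & 1 & 0 \\
0 & 1 & 2 & 1 \\
1 & 1 & 1 & 2
\end{bmatrix}
\]

Find a basis for im(T). In this case, the original vectors do not form an independent set.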

Exercise 9.7.4 If $\{v_1, \cdots, v_r\}$ is linearly independent and T is a one to one linear transformation, show
that $\{Tv_1, \cdots, Tv_r\}$ is also linearly independent. Give an example which shows that if T is only linear, it
can happen that, although $\{v_1, \cdots, v_r\}$ is linearly independent, $\{Tv_1, \cdots, Tv_r\}$ is not. In fact, show that it
can happen that each of the $Tv_j$ equals 0.

Exercise 9.7.5 Let V and W be subspaces of $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively and let $T : V \to W$ be a linear
transformation. Show that if T is onto W and if $\{v_1, \cdots, v_r\}$ is a basis for V, then $\operatorname{span}\{Tv_1, \cdots, Tv_r\} = W$.

Exercise 9.7.6 Define $T : \mathbb{R}^4 \to \mathbb{R}^3$ as follows.

\[
Tx = \begin{bmatrix}
3 & 2 & 1 & 8 \\
2 & 2 & -2 & 6 \\
1 & 1 & -1 & 3
\end{bmatrix} x
\]

Find a basis for im(T). Also find a basis for ker(T).

Exercise 9.7.7 Define $T : \mathbb{R}^3 \to \mathbb{R}^3$ as follows.

\[
Tx = \begin{bmatrix}
1 & 2 & 0 \\
1 & 1 & 1 \\
0 & 1 & 1
\end{bmatrix} x
\]

where on the right, it is just matrix multiplication of the vector x which is meant. Explain why T is an
isomorphism of $\mathbb{R}^3$ to $\mathbb{R}^3$.

Exercise 9.7.8 Suppose $T : \mathbb{R}^3 \to \mathbb{R}^3$ is a linear transformation given by

\[
Tx = Ax
\]

where A is a $3 \times 3$ matrix. Show that T is an isomorphism if and only if A is invertible.

Exercise 9.7.9 Suppose $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation given by

\[
Tx = Ax
\]

where A is an $m \times n$ matrix. Show that T is never an isomorphism if $m \neq n$. In particular, show that if
$m > n$, T cannot be onto and if $m < n$, then T cannot be one to one.

Exercise 9.7.10 Define $T : \mathbb{R}^2 \to \mathbb{R}^3$ as follows.

\[
Tx = \begin{bmatrix}
1 & 0 \\
1 & 1 \\
0 & 1
\end{bmatrix} x
\]

where on the right, it is just matrix multiplication of the vector x which is meant. Show that T is one to
one. Next let $W = \operatorname{im}(T)$. Show that T is an isomorphism of $\mathbb{R}^2$ and im(T).


Exercise 9.7.11 In the above problem, find a $2 \times 3$ matrix A such that the restriction of A to im(T) gives
the same result as $T^{-1}$ on im(T). Hint: You might let A be such that


now find another vector $v \in \mathbb{R}^3$ such that


is a basis. You could pick


for example. Explain why this one works or one of your choice works. Then you could define Av to equal
some vector in $\mathbb{R}^2$. Explain why there will be more than one such matrix A which will deliver the inverse
isomorphism $T^{-1}$ on im(T).


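Exercise 9.7.12 Now let V equal $\operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \right\}$ and let $T : V \to W$ be a linear transformation
where

\[
W = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \end{bmatrix} \right\}
\]

and

\[
T \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \quad
T \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \end{bmatrix}
\]

Explain why T is an isomorphism. Determine a matrix A which, when multiplied on the left, gives the same
result as T on V and a matrix B which delivers $T^{-1}$ on W. Hint: You need to have

\[
A \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \\ 0 & 1 \end{bmatrix}
\]

Now enlarge $\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$ to obtain a basis for $\mathbb{R}^3$. You could add in $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$
for example, and then pick another vector in $\mathbb{R}^4$ and let $A \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ equal this other vector. Then you would have

\[
A \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}
\]

This would involve picking for the new vector in $\mathbb{R}^4$ the vector $\begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix}^T$. Then you could find A.
You can do something similar to find a matrix for $T^{-1}$ denoted as B.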


9.8  The  Kernel  And  Image  Of  A  Linear  Map 


Outcomes 


A.  Describe  the  kernel  and  image  of  a  linear  transformation. 

B.  Use  the  kernel  and  image  to  determine  if  a  linear  transformation  is  one  to  one  or  onto. 


Here we consider the case where the linear map is not necessarily an isomorphism. First here is a
definition of what is meant by the image and kernel of a linear transformation.
For a linear transformation $T : V \to W$, the image is $\operatorname{im}(T) = \{T(v) : v \in V\}$ and the kernel is $\ker(T) = \{v \in V : T(v) = 0\}$. Then in fact, both im(T) and ker(T) are subspaces of W and V respectively.


Proof. First consider ker(T). It is necessary to show that if $v_1, v_2$ are vectors in ker(T) and if a, b are
scalars, then $av_1 + bv_2$ is also in ker(T). But

\[
T(av_1 + bv_2) = aT(v_1) + bT(v_2) = a0 + b0 = 0
\]

Thus ker(T) is a subspace of V.

Next suppose $T(v_1), T(v_2)$ are two vectors in im(T). Then if a, b are scalars,

\[
aT(v_1) + bT(v_2) = T(av_1 + bv_2)
\]

and this last vector is in im(T) by definition. ♠

Consider  the  following  example. 


Solution. We will first find the kernel of T. It consists of all polynomials in $P_1$ that have 1 for a root.

\[
\begin{aligned}
\ker(T) &= \{p(x) \in P_1 \mid p(1) = 0\} \\
&= \{ax + b \mid a, b \in \mathbb{R} \text{ and } a + b = 0\} \\
&= \{ax - a \mid a \in \mathbb{R}\}
\end{aligned}
\]

Therefore a basis for ker(T) is

\[
\{x - 1\}
\]

Notice that this is a subspace of $P_1$.

Now consider the image. It consists of all numbers which can be obtained by evaluating all polynomials
in $P_1$ at 1.

\[
\begin{aligned}
\operatorname{im}(T) &= \{p(1) \mid p(x) \in P_1\} \\
&= \{a + b \mid ax + b \in P_1\} \\
&= \{a + b \mid a, b \in \mathbb{R}\} \\
&= \mathbb{R}
\end{aligned}
\]

Therefore a basis for im(T) is

\[
\{1\}
\]

Notice that this is a subspace of $\mathbb{R}$, and in fact is the space $\mathbb{R}$ itself.

♠


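Solution. You can verify that T represents a linear transformation.

Now we want to find a way to describe all matrices A such that $T(A) = 0$, that is the matrices in ker(T).
Suppose $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is such a matrix. Then

\[
T \begin{bmatrix} a & b \\ c & d \end{bmatrix}
= \begin{bmatrix} a - b \\ c + d \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]

The values of a, b, c, d that make this true are given by solutions to the system

\[
\begin{aligned}
a - b &= 0 \\
c + d &= 0
\end{aligned}
\]

The solution is $a = s, b = s, c = t, d = -t$ where s, t are scalars. We can describe ker(T) as follows.

\[
\ker(T) = \left\{ \begin{bmatrix} s & s \\ t & -t \end{bmatrix} \right\}
= \operatorname{span}\left\{ \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & -1 \end{bmatrix} \right\}
\]

It is clear that this set is linearly independent and therefore forms a basis for ker(T).
We now wish to find a basis for im(T). We can write the image of T as

\[
\operatorname{im}(T) = \left\{ \begin{bmatrix} a - b \\ c + d \end{bmatrix} \right\}
\]

Notice that this can be written as

\[
\operatorname{span}\left\{
\begin{bmatrix} 1 \\ 0 \end{bmatrix},
\begin{bmatrix} -1 \\ 0 \end{bmatrix},
\begin{bmatrix} 0 \\ 1 \end{bmatrix},
\begin{bmatrix} 0 \\ 1 \end{bmatrix}
\right\}
\]

However this is clearly not linearly independent. Removing vectors from the set to create an independent
set gives a basis of im(T).

\[
\left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\}
\]

Notice that these vectors have the same span as the set above but are now linearly independent. ♠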
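
As a quick sanity check of this solution, the sketch below assumes the transformation is the one used in the computation above, sending a 2 x 2 matrix with rows (a, b) and (c, d) to the vector (a - b, c + d). The names `kernel_basis` and `standard_basis` are illustrative only.

```python
def T(A):
    """T of the 2x2 matrix ((a, b), (c, d)) is the vector (a - b, c + d)."""
    (a, b), (c, d) = A
    return (a - b, c + d)

# The two matrices claimed to span ker(T); both should map to the zero vector.
kernel_basis = [((1, 1), (0, 0)), ((0, 0), (1, -1))]
kernel_images = [T(A) for A in kernel_basis]

# Images of the standard basis matrices of M_22; these four vectors span im(T).
standard_basis = [
    ((1, 0), (0, 0)),
    ((0, 1), (0, 0)),
    ((0, 0), (1, 0)),
    ((0, 0), (0, 1)),
]
spanning_set = [T(E) for E in standard_basis]

print(kernel_images)   # both entries are the zero vector
print(spanning_set)    # the dependent spanning set written out in the solution
```

The spanning set printed here contains the repeated vector (0, 1) and the pair (1, 0), (-1, 0), which is why two vectors can be discarded to reach a basis of the image.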
A major result is the relation between the dimension of the kernel and dimension of the image of a
linear transformation. A special case was done earlier in the context of matrices. Recall that for an $m \times n$
matrix A, it was the case that the dimension of the kernel of A added to the rank of A equals n.


Theorem  9.81:  Dimension  of  Kernel  +  Image 


Let $T : V \to W$ be a linear transformation where V, W are vector spaces. Suppose the dimension of
V is n. Then $n = \dim(\ker(T)) + \dim(\operatorname{im}(T))$.


Proof. From Proposition 9.78, im(T) is a subspace of W. By Theorem 9.48, there exists a basis for
im(T), $\{T(v_1), \cdots, T(v_r)\}$. Similarly, there is a basis for ker(T), $\{u_1, \cdots, u_s\}$. Then if $v \in V$, there exist
scalars $c_i$ such that

\[
T(v) = \sum_{i=1}^{r} c_i T(v_i)
\]

Hence $T\left( v - \sum_{i=1}^{r} c_i v_i \right) = 0$. It follows that $v - \sum_{i=1}^{r} c_i v_i$ is in ker(T). Hence there are scalars $a_j$ such that

\[
v - \sum_{i=1}^{r} c_i v_i = \sum_{j=1}^{s} a_j u_j
\]

Hence $v = \sum_{i=1}^{r} c_i v_i + \sum_{j=1}^{s} a_j u_j$. Since v is arbitrary, it follows that

\[
V = \operatorname{span}\{u_1, \cdots, u_s, v_1, \cdots, v_r\}
\]

If the vectors $\{u_1, \cdots, u_s, v_1, \cdots, v_r\}$ are linearly independent, then it will follow that this set is a basis.
Suppose then that

\[
\sum_{i=1}^{r} c_i v_i + \sum_{j=1}^{s} a_j u_j = 0
\]

Apply T to both sides to obtain

\[
\sum_{i=1}^{r} c_i T(v_i) + \sum_{j=1}^{s} a_j T(u_j) = \sum_{i=1}^{r} c_i T(v_i) = 0
\]

Since $\{T(v_1), \cdots, T(v_r)\}$ is linearly independent, it follows that each $c_i = 0$. Hence $\sum_{j=1}^{s} a_j u_j = 0$ and
so, since the $\{u_1, \cdots, u_s\}$ are linearly independent, it follows that each $a_j = 0$ also. It follows that
$\{u_1, \cdots, u_s, v_1, \cdots, v_r\}$ is a basis for V and so

\[
n = s + r = \dim(\ker(T)) + \dim(\operatorname{im}(T))
\]

♠


Consider  the  following  definition. 


Definition  9.82:  Rank  of  Linear  Transformation 


Let $T : V \to W$ be a linear transformation and suppose V, W are finite dimensional vector spaces.
Then the rank of T, denoted as rank(T), is defined as the dimension of im(T). The nullity of T is
the dimension of ker(T). Thus the above theorem says that $\operatorname{rank}(T) + \dim(\ker(T)) = \dim(V)$.
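
The relation between rank and nullity can be illustrated for a matrix transformation $Tx = Ax$. The sketch below is only an illustration under stated assumptions: the matrix `A` is a made-up example (its third row is the sum of the first two), and `rref` and `kernel_basis` are hypothetical helper names written here with exact fractions rather than taken from any library.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row-echelon form, list of pivot column indices)."""
    M = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(M[0])):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    return M, pivots

def kernel_basis(rows):
    """One basis vector of ker(A) per free (non-pivot) column."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -R[r][f]
        basis.append(v)
    return basis

A = [[1, 2, 0, 1],
     [0, 1, 1, 1],
     [1, 3, 1, 2]]   # hypothetical matrix: row 3 = row 1 + row 2

rank = len(rref(A)[1])          # dim(im(T)) = number of pivot columns
ker = kernel_basis(A)
nullity = len(ker)              # dim(ker(T))
print(rank, nullity, rank + nullity == len(A[0]))
```

Here the domain is $\mathbb{R}^4$, so the theorem predicts that the rank and the nullity add up to 4.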
Recall the following important result.


Theorem  9.83:  Subspace  of  Same  Dimension 


Let V be a vector space of dimension n and let W be a subspace. Then $W = V$ if and only if the
dimension of W is also n.


From  this  theorem  follows  the  next  corollary. 


Corollary  9.84:  One  to  One  and  Onto  Characterization 


Let $T : V \to W$ be a linear map where the dimension of V is n and the dimension of W is m. Then
T is one to one if and only if $\ker(T) = \{0\}$ and T is onto if and only if $\operatorname{rank}(T) = m$.


Proof. The statement $\ker(T) = \{0\}$ is equivalent to saying if $T(v) = 0$, it follows that $v = 0$. Thus
by Lemma 9.64 T is one to one. If T is onto, then $\operatorname{im}(T) = W$ and so rank(T), which is defined as the
dimension of im(T), is m. If $\operatorname{rank}(T) = m$, then by Theorem 9.83, since im(T) is a subspace of W, it
follows that $\operatorname{im}(T) = W$. ♠


Solution. You may recall this example from earlier in Example 9.65. Here we will determine that S is one
to one, but not onto, using the method provided in Corollary 9.84.

By definition,

\[
\ker(S) = \{ax^2 + bx + c \in P_2 \mid a + b = 0, \; a + c = 0, \; b - c = 0, \; b + c = 0\}.
\]

Suppose $p(x) = ax^2 + bx + c \in \ker(S)$. This leads to a homogeneous system of four equations in three
variables. Putting the augmented matrix in reduced row-echelon form:

\[
\left[ \begin{array}{rrr|r}
1 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 \\
0 & 1 & -1 & 0 \\
0 & 1 & 1 & 0
\end{array} \right]
\rightarrow \cdots \rightarrow
\left[ \begin{array}{rrr|r}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0
\end{array} \right]
\]

Since the unique solution is $a = b = c = 0$, $\ker(S) = \{0\}$, and thus S is one-to-one by Corollary 9.84.
Similarly, by Corollary 9.84, if S is onto it will have $\operatorname{rank}(S) = \dim(M_{22}) = 4$. The image of S is given
by

\[
\operatorname{im}(S) = \left\{ \begin{bmatrix} a + b & a + c \\ b - c & b + c \end{bmatrix} \right\}
= \operatorname{span}\left\{
\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix},
\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix},
\begin{bmatrix} 0 & 1 \\ -1 & 1 \end{bmatrix}
\right\}
\]

These matrices are linearly independent, which means this set forms a basis for im(S). Therefore the
dimension of im(S), also called rank(S), is equal to 3. It follows that S is not onto. ♠
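
The conclusions of this example can be spot-checked by brute force. The sketch below assumes S is the map read off from the solution, sending $ax^2 + bx + c$ to the matrix with rows (a+b, a+c) and (b-c, b+c). The grid of integer coefficients is an arbitrary choice, and the linear relation tested (top-left entry minus top-right entry equals bottom-left entry) is an observation derived here from the formula for S, not a statement from the text.

```python
from itertools import product

def S(a, b, c):
    """S(a x^2 + b x + c) as the 2x2 matrix ((a+b, a+c), (b-c, b+c))."""
    return ((a + b, a + c), (b - c, b + c))

grid = list(product(range(-3, 4), repeat=3))
zero = ((0, 0), (0, 0))

# Only the zero polynomial on the grid is sent to the zero matrix.
kernel_on_grid = [v for v in grid if S(*v) == zero]

# Every image satisfies (a+b) - (a+c) = b - c, a linear constraint on the
# entries, so a matrix violating it (for example ((1, 0), (0, 0))) is missed.
relation_holds = all(
    S(*v)[0][0] - S(*v)[0][1] == S(*v)[1][0] for v in grid
)
print(kernel_on_grid, relation_holds)
```

A finite grid cannot prove injectivity, but the constraint on the image entries does explain directly why the rank is at most 3 and S cannot be onto.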


Exercises 


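Exercise 9.8.1 Let $V = \mathbb{R}^3$ and let

\[
W = \operatorname{span}(S), \text{ where } S = \left\{
\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix},
\begin{bmatrix} -2 \\ 2 \\ -2 \end{bmatrix},
\begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix},
\begin{bmatrix} 1 \\ -1 \\ 3 \end{bmatrix}
\right\}
\]

Find a basis of W consisting of vectors in S.

Exercise 9.8.2 Let T be a linear transformation given by

\[
T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
\]

Find a basis for ker(T) and im(T).

Exercise 9.8.3 Let T be a linear transformation given by

\[
T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
\]

Find a basis for ker(T) and im(T).

Exercise 9.8.4 Let $V = \mathbb{R}^3$ and let

\[
W = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 2 \\ -1 \end{bmatrix} \right\}
\]

Extend this basis of W to a basis of V.

Exercise 9.8.5 Let T be a linear transformation given by

\[
T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]

What is $\dim(\ker(T))$?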
9.9 The Matrix of a Linear Transformation


Outcomes 


A.  Find  the  matrix  of  a  linear  transformation  with  respect  to  general  bases  in  vector  spaces. 


You may recall from $\mathbb{R}^n$ that the matrix of a linear transformation depends on the bases chosen. This
concept is explored in this section, where the linear transformation now maps from one arbitrary vector
space to another.

Let $T : V \to W$ be an isomorphism where V and W are vector spaces. Recall from Lemma 9.74 that T
maps a basis in V to a basis in W. When discussing this Lemma, we were not specific on what this basis
looked like. In this section we will make such a distinction.

Consider  now  an  important  definition. 


We  continue  with  another  related  definition. 


Definition  9.87:  Coordinate  Vector 


Let V be a finite dimensional vector space with $\dim(V) = n$, and let $B = \{b_1, b_2, \cdots, b_n\}$ be an
ordered basis of V (meaning that the order in which the vectors are listed is taken into account). The
coordinate vector of v with respect to B is defined as $C_B(v)$.


Consider the following example.

Solution.


1. First, note the order of the basis is important. Now we need to find $a_1, a_2, a_3$ such that $\vec{x} = a_1(1) + a_2(x) + a_3(x^2)$, that is:

\[
-x^2 - 2x + 4 = a_1(1) + a_2(x) + a_3(x^2)
\]

Clearly the solution is

\[
a_1 = 4, \quad a_2 = -2, \quad a_3 = -1
\]

Therefore the coordinate vector is

\[
C_B(\vec{x}) = \begin{bmatrix} 4 \\ -2 \\ -1 \end{bmatrix}
\]

2. Again remember that the order of B is important. We proceed as above. We need to find $a_1, a_2, a_3$
such that $\vec{x} = a_1(x^2) + a_2(x) + a_3(1)$, that is:

\[
-x^2 - 2x + 4 = a_1(x^2) + a_2(x) + a_3(1)
\]

Here the solution is

\[
a_1 = -1, \quad a_2 = -2, \quad a_3 = 4
\]

Therefore the coordinate vector is

\[
C_B(\vec{x}) = \begin{bmatrix} -1 \\ -2 \\ 4 \end{bmatrix}
\]

3. Now we need to find $a_1, a_2, a_3$ such that $\vec{x} = a_1(x + x^2) + a_2(x) + a_3(4)$, that is:

\[
\begin{aligned}
-x^2 - 2x + 4 &= a_1(x + x^2) + a_2(x) + a_3(4) \\
&= a_1(x^2) + (a_1 + a_2)(x) + a_3(4)
\end{aligned}
\]

The solution is

\[
a_1 = -1, \quad a_2 = -1, \quad a_3 = 1
\]

and the coordinate vector is

\[
C_B(\vec{x}) = \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}
\]

♠
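
Finding a coordinate vector amounts to solving a small linear system, which can be sketched in code. The following is an illustration only: polynomials are stored as coefficient lists in the order (constant, x, x^2), which is a representation chosen here, and `solve` and `coords` are hypothetical helper names. The three bases are the ones used in the solution above.

```python
from fractions import Fraction

def solve(A, y):
    """Solve A a = y for a square invertible A by Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(t)] for row, t in zip(A, y)]
    for c in range(n):
        p = next(i for i in range(c, n) if M[i][c] != 0)
        M[c], M[p] = M[p], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for i in range(n):
            if i != c:
                f = M[i][c]
                M[i] = [x - f * z for x, z in zip(M[i], M[c])]
    return [row[n] for row in M]

def coords(basis, p):
    """Coordinate vector of p in the ordered basis (basis vectors are columns)."""
    n = len(p)
    A = [[b[i] for b in basis] for i in range(n)]
    return solve(A, p)

# Coefficient order is (constant, x, x^2); p is -x^2 - 2x + 4.
p = [4, -2, -1]
basis_1 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # {1, x, x^2}
basis_2 = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]   # {x^2, x, 1}
basis_3 = [[0, 1, 1], [0, 1, 0], [4, 0, 0]]   # {x + x^2, x, 4}

for basis in (basis_1, basis_2, basis_3):
    print([int(a) for a in coords(basis, p)])
```

Reordering the basis permutes the entries of the coordinate vector, which is why the basis must be treated as ordered.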


Given that the coordinate transformation $C_B : V \to \mathbb{R}^n$ is an isomorphism, its inverse exists.


We  now  discuss  the  main  result  of  this  section,  that  is  how  to  represent  a  linear  transformation  with 
respect  to  different  bases. 

Let V and W be finite dimensional vector spaces, and suppose

• $\dim(V) = n$ and $B_1 = \{b_1, b_2, \cdots, b_n\}$ is an ordered basis of V;

• $\dim(W) = m$ and $B_2$ is an ordered basis of W.

Let $T : V \to W$ be a linear transformation. If $V = \mathbb{R}^n$ and $W = \mathbb{R}^m$, then we can find a matrix A so that
$T_A = T$. For arbitrary vector spaces V and W, our goal is to represent T as a matrix, i.e., find a matrix A
so that $T_A : \mathbb{R}^n \to \mathbb{R}^m$ and $T_A = C_{B_2} T C_{B_1}^{-1}$.

To find the matrix A:

\[
T_A = C_{B_2} T C_{B_1}^{-1} \text{ implies that } T_A C_{B_1} = C_{B_2} T,
\]

and thus for any $v \in V$,

\[
C_{B_2}[T(v)] = T_A[C_{B_1}(v)] = A C_{B_1}(v).
\]

Since $C_{B_1}(b_j) = e_j$ for each $b_j \in B_1$, $A C_{B_1}(b_j) = A e_j$, which is simply the jth column of A. Therefore,
the jth column of A is equal to $C_{B_2}[T(b_j)]$.

The matrix of T corresponding to the ordered bases $B_1$ and $B_2$ is denoted $M_{B_2 B_1}(T)$ and is given by

\[
M_{B_2 B_1}(T) = \begin{bmatrix} C_{B_2}[T(b_1)] & C_{B_2}[T(b_2)] & \cdots & C_{B_2}[T(b_n)] \end{bmatrix}.
\]
This  result  is  given  in  the  following  theorem. 


Theorem  9.90: 


Let V and W be vector spaces of dimension n and m respectively, with $B_1 = \{b_1, b_2, \ldots, b_n\}$ an
ordered basis of V and $B_2$ an ordered basis of W. Suppose $T : V \to W$ is a linear transformation.
Then the unique matrix $M_{B_2 B_1}(T)$ of T corresponding to $B_1$ and $B_2$ is given by

\[
M_{B_2 B_1}(T) = \begin{bmatrix} C_{B_2}[T(b_1)] & C_{B_2}[T(b_2)] & \cdots & C_{B_2}[T(b_n)] \end{bmatrix}.
\]

This matrix satisfies $C_{B_2}[T(v)] = M_{B_2 B_1}(T) C_{B_1}(v)$ for all $v \in V$.


We  demonstrate  this  content  in  the  following  examples. 


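Solution. To find $M_{B_2 B_1}(T)$, we use the following definition.

\[
M_{B_2 B_1}(T) = \begin{bmatrix} C_{B_2}[T(x^3)] & C_{B_2}[T(x^2)] & C_{B_2}[T(x)] & C_{B_2}[T(1)] \end{bmatrix}
\]

First we find the result of applying T to the basis $B_1$.

\[
T(x^3) = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix}, \quad
T(x^2) = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad
T(x) = \begin{bmatrix} 0 \\ -1 \\ 1 \\ 0 \end{bmatrix}, \quad
T(1) = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix}
\]

Next we apply the coordinate isomorphism $C_{B_2}$ to each of these vectors. We will show the first in
detail.

\[
\begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix}
= a_1 \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}
+ a_2 \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}
+ a_3 \begin{bmatrix} 0 \\ 0 \\ -1 \\ 0 \end{bmatrix}
+ a_4 \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
\]

This implies that

\[
\begin{aligned}
a_1 &= 1 \\
a_2 &= 0 \\
a_1 - a_3 &= 0 \\
a_4 &= 1
\end{aligned}
\]

which has a solution given by

\[
a_1 = 1, \quad a_2 = 0, \quad a_3 = 1, \quad a_4 = 1
\]

Therefore $C_{B_2}[T(x^3)] = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 1 \end{bmatrix}$.

You can verify that the following are true.

\[
C_{B_2}[T(x^2)] = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \end{bmatrix}, \quad
C_{B_2}[T(x)] = \begin{bmatrix} 0 \\ -1 \\ -1 \\ 0 \end{bmatrix}, \quad
C_{B_2}[T(1)] = \begin{bmatrix} 0 \\ 0 \\ -1 \\ 1 \end{bmatrix}
\]

Using these vectors as the columns of $M_{B_2 B_1}(T)$ we have

\[
M_{B_2 B_1}(T) = \begin{bmatrix}
1 & 1 & 0 & 0 \\
0 & 1 & -1 & 0 \\
1 & 1 & -1 & -1 \\
1 & 0 & 0 & 1
\end{bmatrix}
\]

♠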
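
The column-by-column construction can be sketched in code. This sketch assumes the data read off from the worked examples in this section: $T(ax^3 + bx^2 + cx + d) = (a+b, b-c, c+d, a+d)$, $B_1 = \{x^3, x^2, x, 1\}$, and $B_2$ consisting of the four vectors listed in `B2` below. The function names and the sample polynomial `p` are illustrative choices.

```python
def T(p):
    """T(a x^3 + b x^2 + c x + d) = (a+b, b-c, c+d, a+d), with p = (a, b, c, d)."""
    a, b, c, d = p
    return (a + b, b - c, c + d, a + d)

B2 = [(1, 0, 1, 0), (0, 1, 0, 0), (0, 0, -1, 0), (0, 0, 0, 1)]

def C_B2(w):
    """Coordinates of w in B2, solved by hand: a1=w1, a2=w2, a3=w1-w3, a4=w4."""
    coords = (w[0], w[1], w[0] - w[2], w[3])
    # Sanity check: the coordinates really rebuild w from the basis B2.
    rebuilt = tuple(sum(a * b[i] for a, b in zip(coords, B2)) for i in range(4))
    assert rebuilt == w
    return coords

# With B1 = {x^3, x^2, x, 1}, the coordinate vector of p is just (a, b, c, d),
# so the basis polynomials correspond to the standard unit vectors.
B1 = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
columns = [C_B2(T(b)) for b in B1]
M = [[col[i] for col in columns] for i in range(4)]   # columns -> rows

# Check C_B2[T(p)] = M C_B1(p) on a sample polynomial.
p = (2, -1, 3, 5)
lhs = C_B2(T(p))
rhs = tuple(sum(M[i][j] * p[j] for j in range(4)) for i in range(4))
for row in M:
    print(row)
print(lhs == rhs)
```

Because $B_2$ is so simple, its coordinate map is written in closed form here; for a general basis one would solve a linear system for each column instead.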

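The next example demonstrates that this method can be used to solve different types of problems. We
will examine the above example and see if we can work backwards to determine the action of T from the
matrix $M_{B_2 B_1}(T)$.

Solution. Recall that $C_{B_2}[T(p(x))] = M_{B_2 B_1}(T) C_{B_1}(p(x))$. Then we have

\[
\begin{aligned}
C_{B_2}[T(p(x))] &= M_{B_2 B_1}(T) C_{B_1}(p(x)) \\
&= \begin{bmatrix}
1 & 1 & 0 & 0 \\
0 & 1 & -1 & 0 \\
1 & 1 & -1 & -1 \\
1 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} \\
&= \begin{bmatrix} a + b \\ b - c \\ a + b - c - d \\ a + d \end{bmatrix}
\end{aligned}
\]

Therefore

\[
\begin{aligned}
T(p(x)) &= C_{B_2}^{-1} \begin{bmatrix} a + b \\ b - c \\ a + b - c - d \\ a + d \end{bmatrix} \\
&= (a + b) \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}
+ (b - c) \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}
+ (a + b - c - d) \begin{bmatrix} 0 \\ 0 \\ -1 \\ 0 \end{bmatrix}
+ (a + d) \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} \\
&= \begin{bmatrix} a + b \\ b - c \\ c + d \\ a + d \end{bmatrix}
\end{aligned}
\]

You can verify that this was the definition of $T(p(x))$ given in the previous example.

♠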
We can also find the matrix of the composite of multiple transformations.


Theorem  9.93:  Matrix  of  Composition 


Let V, W and U be finite dimensional vector spaces, and suppose $T : V \to W$, $S : W \to U$ are linear
transformations. Suppose V, W and U have ordered bases of $B_1$, $B_2$ and $B_3$ respectively. Then the
matrix of the composite transformation $S \circ T$ (or ST) is given by

\[
M_{B_3 B_1}(ST) = M_{B_3 B_2}(S) M_{B_2 B_1}(T).
\]
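
A small numerical illustration of this theorem, using standard bases so that each transformation is multiplication by its matrix. The matrices `M_T` and `M_S` are made-up examples and `matvec` and `matmul` are hypothetical helpers, not taken from the text.

```python
def matvec(M, v):
    """Matrix times vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Matrix product A B."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

M_T = [[1, 0], [1, 1], [0, 1]]     # hypothetical T : R^2 -> R^3
M_S = [[1, 2, 0], [0, 1, -1]]      # hypothetical S : R^3 -> R^2

# Matrix of the composite, built column by column from S(T(e_j)).
e = [[1, 0], [0, 1]]
columns = [matvec(M_S, matvec(M_T, ej)) for ej in e]
M_ST_direct = [[col[i] for col in columns] for i in range(2)]

# The theorem says this equals the product of the two matrices.
M_ST_product = matmul(M_S, M_T)
print(M_ST_direct, M_ST_product)
```

Note the order of the factors: the matrix of T, which acts first, appears on the right.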


The  next  important  theorem  gives  a  condition  on  when  T  is  an  isomorphism. 


Theorem  9.94:  Isomorphism 


Let V and W be vector spaces such that both have dimension n and let $T : V \to W$ be a linear
transformation. Suppose $B_1$ is an ordered basis of V and $B_2$ is an ordered basis of W.

Then the conditions that $M_{B_2 B_1}(T)$ is invertible for all $B_1$ and $B_2$, and that $M_{B_2 B_1}(T)$ is invertible for
some $B_1$ and $B_2$ are equivalent. In fact, these occur if and only if T is an isomorphism.

If T is an isomorphism, the matrix $M_{B_2 B_1}(T)$ is invertible and its inverse is given by $[M_{B_2 B_1}(T)]^{-1} = M_{B_1 B_2}(T^{-1})$.


Consider  the  following  example. 


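Solution.

1.

\[
\begin{aligned}
M_{B_2 B_1}(T) &= \begin{bmatrix} C_{B_2}[T(1)] & C_{B_2}[T(x)] & C_{B_2}[T(x^2)] & C_{B_2}[T(x^3)] \end{bmatrix} \\
&= \begin{bmatrix}
C_{B_2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} &
C_{B_2}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} &
C_{B_2}\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} &
C_{B_2}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
\end{bmatrix} \\
&= \begin{bmatrix}
1 & 0 & 0 & 1 \\
0 & 1 & -1 & 0 \\
0 & 1 & 1 & 0 \\
1 & 0 & 0 & -1
\end{bmatrix}
\end{aligned}
\]

2. $\det(M_{B_2 B_1}(T)) = -4$, which is nonzero, so the matrix is invertible, and hence T is an isomorphism.

3.

\[
T^{-1}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = 1, \quad
T^{-1}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = x, \quad
T^{-1}\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = x^2, \quad
T^{-1}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} = x^3,
\]

so

\[
T^{-1}\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} = \frac{1 + x^3}{2}, \quad
T^{-1}\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} = \frac{x - x^2}{2}, \quad
T^{-1}\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} = \frac{x + x^2}{2}, \quad
T^{-1}\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \frac{1 - x^3}{2}.
\]

Therefore,

\[
M_{B_1 B_2}(T^{-1}) = \frac{1}{2} \begin{bmatrix}
1 & 0 & 0 & 1 \\
0 & 1 & 1 & 0 \\
0 & -1 & 1 & 0 \\
1 & 0 & 0 & -1
\end{bmatrix}
\]

You should verify that $M_{B_2 B_1}(T) M_{B_1 B_2}(T^{-1}) = I_4$. From this it follows that $[M_{B_2 B_1}(T)]^{-1} = M_{B_1 B_2}(T^{-1})$.

4.

\[
C_{B_1}\left( T^{-1}\begin{bmatrix} p & q \\ r & s \end{bmatrix} \right)
= M_{B_1 B_2}(T^{-1}) \, C_{B_2}\left( \begin{bmatrix} p & q \\ r & s \end{bmatrix} \right)
\]

\[
\begin{aligned}
T^{-1}\begin{bmatrix} p & q \\ r & s \end{bmatrix}
&= C_{B_1}^{-1}\left( M_{B_1 B_2}(T^{-1}) \, C_{B_2}\left( \begin{bmatrix} p & q \\ r & s \end{bmatrix} \right) \right) \\
&= C_{B_1}^{-1}\left( \frac{1}{2} \begin{bmatrix}
1 & 0 & 0 & 1 \\
0 & 1 & 1 & 0 \\
0 & -1 & 1 & 0 \\
1 & 0 & 0 & -1
\end{bmatrix}
\begin{bmatrix} p \\ q \\ r \\ s \end{bmatrix} \right) \\
&= C_{B_1}^{-1}\left( \frac{1}{2} \begin{bmatrix} p + s \\ q + r \\ r - q \\ p - s \end{bmatrix} \right) \\
&= \frac{1}{2}(p + s) + \frac{1}{2}(q + r)x + \frac{1}{2}(r - q)x^2 + \frac{1}{2}(p - s)x^3
\end{aligned}
\]

♠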
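
The suggested verification that the two matrices are inverses can be carried out mechanically. The sketch below takes the two matrices as reconstructed from this example (the second one carries the factor 1/2), and uses a hypothetical helper `matmul`. Exact fractions avoid rounding issues.

```python
from fractions import Fraction

def matmul(A, B):
    """Matrix product A B."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

# Matrix of T with respect to B1 and B2, as read from this example.
M_T = [[1, 0, 0, 1],
       [0, 1, -1, 0],
       [0, 1, 1, 0],
       [1, 0, 0, -1]]

# Matrix of the inverse transformation: one half times an integer matrix.
half = Fraction(1, 2)
M_T_inv = [[half * x for x in row] for row in
           [[1, 0, 0, 1],
            [0, 1, 1, 0],
            [0, -1, 1, 0],
            [1, 0, 0, -1]]]

identity = [[Fraction(int(i == j)) for j in range(4)] for i in range(4)]
left = matmul(M_T, M_T_inv)
right = matmul(M_T_inv, M_T)
print(left == identity, right == identity)
```

Checking both products is redundant for square matrices, but it costs nothing and guards against transcription slips.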
Exercises


Exercise 9.9.1 Consider the following functions which map $\mathbb{R}^n$ to $\mathbb{R}^n$.

(a) T multiplies the jth component of x by a nonzero number b.

(b) T replaces the ith component of x with b times the jth component added to the ith component.

(c) T switches the ith and jth components.

Show these functions are linear transformations and describe their matrices A such that $T(x) = Ax$.

Exercise 9.9.2 You are given a linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ and you know that

\[
T(A_j) = B_j
\]

where $\begin{bmatrix} A_1 & \cdots & A_n \end{bmatrix}^{-1}$ exists. Show that the matrix of T is of the form

\[
\begin{bmatrix} B_1 & \cdots & B_n \end{bmatrix} \begin{bmatrix} A_1 & \cdots & A_n \end{bmatrix}^{-1}
\]


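Exercise 9.9.3 Suppose T is a linear transformation such that

\[
T \begin{bmatrix} 1 \\ 2 \\ -6 \end{bmatrix} = \begin{bmatrix} 5 \\ 1 \\ 3 \end{bmatrix}, \quad
T \begin{bmatrix} -1 \\ -1 \\ 5 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix}, \quad
T \begin{bmatrix} 0 \\ -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 5 \\ 3 \\ -2 \end{bmatrix}
\]

Find the matrix of T. That is, find A such that $T(x) = Ax$.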
Exercise  9.9.4  Suppose  T  is  a  linear  transformation  such  that 


1  ' 

'  1  ' 

1 

= 

3 

.  “8  . 

1 

'  -1  ' 

'  2  ' 

0 

4 

6 

1 

T 
Find the matrix of T. That is, find A such that $T(x) = Ax$.

Exercise  9.9.5  Suppose  T  is  a  linear  transformation  such  that 

1  1  [  -3  " 

T  3  1 

_  -i  \  [  3 . 

'  -i  i  r  i " 

T  -2  3 

6  J  L-3 


Find the matrix of T. That is, find A such that $T(x) = Ax$.

Exercise 9.9.6 Suppose T is a linear transformation such that


Find the matrix of T. That is, find A such that $T(x) = Ax$.

Exercise 9.9.7 Suppose T is a linear transformation such that


°  1  r  2  ' 

T  - 1  =  5 

4  J  L-2 
Find the matrix of T. That is, find A such that $T(x) = Ax$.

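Exercise 9.9.8 Consider the following functions $T : \mathbb{R}^3 \to \mathbb{R}^2$. Show that each is a linear transformation
and determine for each the matrix A such that $T(x) = Ax$.

(a) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z \\ 2y - 3x + z \end{bmatrix}$

(b) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 7x + 2y + z \\ 3x - 11y + 2z \end{bmatrix}$

(c) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3x + 2y + z \\ x + 2y + 6z \end{bmatrix}$

(d) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2y - 5x + z \\ x + y + z \end{bmatrix}$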

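Exercise 9.9.9 Consider the following functions $T : \mathbb{R}^3 \to \mathbb{R}^2$. Explain why each of these functions T is
not linear.

(a) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z + 1 \\ 2y - 3x + z \end{bmatrix}$

(b) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y^2 + 3z \\ 2y + 3x + z \end{bmatrix}$

(c) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} \sin x + 2y + 3z \\ 2y + 3x + z \end{bmatrix}$

(d) $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z \\ 2y + 3x - \ln z \end{bmatrix}$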

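Exercise 9.9.10 Suppose

\[
\begin{bmatrix} A_1 & \cdots & A_n \end{bmatrix}^{-1}
\]

exists where each $A_j \in \mathbb{R}^n$ and let vectors $\{B_1, \cdots, B_n\}$ in $\mathbb{R}^m$ be given. Show that there always exists a
linear transformation T such that $T(A_j) = B_j$.

Exercise 9.9.11 Find the matrix for $T(\vec{w}) = \operatorname{proj}_{\vec{v}}(\vec{w})$ where $\vec{v} = \begin{bmatrix} 1 & -2 & 3 \end{bmatrix}^T$.

Exercise 9.9.12 Find the matrix for $T(\vec{w}) = \operatorname{proj}_{\vec{v}}(\vec{w})$ where $\vec{v} = \begin{bmatrix} 1 & 5 & 3 \end{bmatrix}^T$.

Exercise 9.9.13 Find the matrix for $T(\vec{w}) = \operatorname{proj}_{\vec{v}}(\vec{w})$ where $\vec{v} = \begin{bmatrix} 1 & 0 & 3 \end{bmatrix}^T$.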


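Exercise 9.9.14 Let B =


be a basis of $\mathbb{R}^2$ and let x =


be a vector in $\mathbb{R}^2$. Find $C_B(x)$.


Exercise 9.9.15 Let $B = \left\{ \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix} \right\}$ be a basis of $\mathbb{R}^3$ and let $x = \begin{bmatrix} 5 \\ -1 \\ 4 \end{bmatrix}$ be a vector
in $\mathbb{R}^3$. Find $C_B(x)$.


Exercise 9.9.16 Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be a linear transformation defined by $T \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} a + b \\ a - b \end{bmatrix}$.

Consider the two bases


and


Find the matrix $M_{B_2, B_1}$ of T with respect to the bases $B_1$ and $B_2$.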

A.  Some  Prerequisite  Topics 


The topics presented in this section are important concepts in mathematics and therefore should be examined.


A.1 Sets and Set Notation


A set is a collection of things called elements. For example $\{1, 2, 3, 8\}$ would be a set consisting of the
elements 1, 2, 3, and 8. To indicate that 3 is an element of $\{1, 2, 3, 8\}$, it is customary to write $3 \in \{1, 2, 3, 8\}$.
We can also indicate when an element is not in a set, by writing $9 \notin \{1, 2, 3, 8\}$, which says that 9 is not an
element of $\{1, 2, 3, 8\}$. Sometimes a rule specifies a set. For example you could specify a set as all integers
larger than 2. This would be written as $S = \{x \in \mathbb{Z} : x > 2\}$. This notation says: S is the set of all integers,
x, such that $x > 2$.

Suppose A and B are sets with the property that every element of A is an element of B. Then we
say that A is a subset of B. For example, $\{1, 2, 3, 8\}$ is a subset of $\{1, 2, 3, 4, 5, 8\}$. In symbols, we write
$\{1, 2, 3, 8\} \subseteq \{1, 2, 3, 4, 5, 8\}$. It is sometimes said that “A is contained in B” or even “B contains A”. The
same statement about the two sets may also be written as $\{1, 2, 3, 4, 5, 8\} \supseteq \{1, 2, 3, 8\}$.

We can also talk about the union of two sets, which we write as $A \cup B$. This is the set consisting of
everything which is an element of at least one of the sets, A or B. As an example of the union of two sets,
consider $\{1, 2, 3, 8\} \cup \{3, 4, 7, 8\} = \{1, 2, 3, 4, 7, 8\}$. This set is made up of the numbers which are in at least
one of the two sets.

In general

\[
A \cup B = \{x : x \in A \text{ or } x \in B\}
\]

Notice that an element which is in both A and B is also in the union, as well as elements which are in only
one of A or B.

Another important set is the intersection of two sets A and B, written $A \cap B$. This set consists of
everything which is in both of the sets. Thus $\{1, 2, 3, 8\} \cap \{3, 4, 7, 8\} = \{3, 8\}$ because 3 and 8 are those
elements the two sets have in common. In general,

\[
A \cap B = \{x : x \in A \text{ and } x \in B\}
\]

If A and B are two sets, $A \setminus B$ denotes the set of things which are in A but not in B. Thus

\[
A \setminus B = \{x \in A : x \notin B\}
\]

For example, if $A = \{1, 2, 3, 8\}$ and $B = \{3, 4, 7, 8\}$, then $A \setminus B = \{1, 2, 3, 8\} \setminus \{3, 4, 7, 8\} = \{1, 2\}$.

A special set which is very important in mathematics is the empty set denoted by $\emptyset$. The empty set, $\emptyset$,
is defined as the set which has no elements in it. It follows that the empty set is a subset of every set. This
is true because if it were not so, there would have to exist a set A, such that $\emptyset$ has something in it which is
not in A. However, $\emptyset$ has nothing in it and so it must be that $\emptyset \subseteq A$.
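
The set operations described above map directly onto Python's built-in `set` type. This short sketch reuses the example sets from the text; the variable names are illustrative.

```python
A = {1, 2, 3, 8}
B = {3, 4, 7, 8}

union = A | B              # elements in at least one of A, B
intersection = A & B       # elements in both A and B
difference = A - B         # elements of A that are not in B

is_member = 3 in A
is_not_member = 9 not in A
is_subset = {1, 2, 3, 8} <= {1, 2, 3, 4, 5, 8}
empty_is_subset = set() <= A   # the empty set is a subset of every set

print(sorted(union), sorted(intersection), sorted(difference))
print(is_member, is_not_member, is_subset, empty_is_subset)
```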

We  can  also  use  brackets  to  denote  sets  which  are  intervals  of  numbers.  Let  a  and  b  be  real  numbers. 
Then 


• $[a, b] = \{x \in \mathbb{R} : a \leq x \leq b\}$

• $[a, b) = \{x \in \mathbb{R} : a \leq x < b\}$

• $(a, b) = \{x \in \mathbb{R} : a < x < b\}$

• $(a, b] = \{x \in \mathbb{R} : a < x \leq b\}$

• $[a, \infty) = \{x \in \mathbb{R} : x \geq a\}$

• $(-\infty, a] = \{x \in \mathbb{R} : x \leq a\}$

These sorts of sets of real numbers are called intervals. The two points a and b are called endpoints,
or bounds, of the interval. In particular, a is the lower bound while b is the upper bound of the above
intervals, where applicable. Other intervals such as $(-\infty, b)$ are defined by analogy to what was just
explained. In general, the curved parenthesis, (, indicates the end point is not included in the interval,
while the square parenthesis, [, indicates this end point is included. The reason that there will always be
a curved parenthesis next to $\infty$ or $-\infty$ is that these are not real numbers and cannot be included in the
interval in the way a real number can.

To  illustrate  the  use  of  this  notation  relative  to  intervals  consider  three  examples  of  inequalities.  Their 
solutions  will  be  written  in  the  interval  notation  just  described. 


Example A.1: Solving an Inequality


Solve the inequality $2x + 4 \le x - 8$.


Solution. We need to find x such that $2x + 4 \le x - 8$. Solving for x, we see that $x \le -12$ is the answer. This is written in terms of an interval as $(-\infty, -12]$. ♠

Consider  the  following  example. 


Solution. We need to find x such that $(x+1)(2x-3) \ge 0$. The solution is given by $x \le -1$ or $x \ge \frac{3}{2}$. Therefore, any x which fits into either of these intervals gives a solution. In terms of set notation this is denoted by $(-\infty, -1] \cup [\frac{3}{2}, \infty)$. ♠


Consider one last example.

Solution. This inequality is true for any value of x where x is a real number. We can write the solution as $\mathbb{R}$ or $(-\infty, \infty)$. ♠

In  the  next  section,  we  examine  another  important  mathematical  concept. 


A.2  Well  Ordering  and  Induction 


We begin this section with some important notation. Summation notation, written $\sum_{i=1}^{j} i$, represents a sum. Here, i is called the index of the sum, and we add iterations until i = j. For example,

\[ \sum_{i=1}^{j} i = 1 + 2 + \cdots + j \]

Another example:

\[ a_{11} + a_{12} + a_{13} = \sum_{i=1}^{3} a_{1i} \]

The following notation is a specific use of summation notation.


Notice that since addition is commutative, $\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} = \sum_{j=1}^{n} \sum_{i=1}^{m} a_{ij}$.
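
The claim that the order of a double sum can be interchanged can be illustrated numerically. The following Python sketch sums a small array of numbers row by row and then column by column; the array `a` and the function names are made up for illustration.

```python
def sum_rows_first(a):
    # Inner sum over j (entries of a row), outer sum over i (rows).
    return sum(sum(row) for row in a)


def sum_columns_first(a):
    # Inner sum over i (entries of a column), outer sum over j (columns).
    n_rows, n_cols = len(a), len(a[0])
    return sum(sum(a[i][j] for i in range(n_rows)) for j in range(n_cols))


# A hypothetical 2 x 3 array of numbers a_ij.
a = [[1, 2, 3],
     [4, 5, 6]]

print(sum_rows_first(a), sum_columns_first(a))
```

Both orders add up the same six numbers, so the two totals agree.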

We  now  consider  the  main  concept  of  this  section.  Mathematical  induction  and  well  ordering  are  two 
extremely  important  principles  in  math.  They  are  often  used  to  prove  significant  things  which  would  be 
hard  to  prove  otherwise. 


Definition  A.5:  Well  Ordered 


A set is well ordered if every nonempty subset S contains a smallest element z having the property that $z \le x$ for all $x \in S$.

In particular, the set of natural numbers defined as

$\mathbb{N} = \{1, 2, \cdots\}$


is  well  ordered. 

Consider  the  following  proposition. 


Proposition  A.6:  Well  Ordered  Sets 


Any  set  of  integers  larger  than  a  given  number  is  well  ordered. 


This proposition claims that if a set of integers has a lower bound which is a real number, then this set is well ordered.

Further,  this  proposition  implies  the  principle  of  mathematical  induction.  The  symbol  Z  denotes  the 
set  of  all  integers.  Note  that  if  a  is  an  integer,  then  there  are  no  integers  between  a  and  a  +  1 . 


Theorem  A.7:  Mathematical  Induction 


A set $S \subseteq \mathbb{Z}$, having the property that $a \in S$ and $n + 1 \in S$ whenever $n \in S$, contains all integers $x \in \mathbb{Z}$ such that $x \ge a$.


Proof. Let T consist of all integers larger than or equal to a which are not in S. The theorem will be proved if $T = \emptyset$. If $T \ne \emptyset$ then by the well ordering principle, there would have to exist a smallest element of T, denoted as b. It must be the case that $b > a$ since by definition, $a \notin T$. Thus $b \ge a + 1$, and so $b - 1 \ge a$ and $b - 1 \notin S$ because if $b - 1 \in S$, then $b - 1 + 1 = b \in S$ by the assumed property of S. Therefore, $b - 1 \in T$ which contradicts the choice of b as the smallest element of T. ($b - 1$ is smaller.) Since a contradiction is obtained by assuming $T \ne \emptyset$, it must be the case that $T = \emptyset$ and this says that every integer at least as large as a is also in S. ♠

Mathematical  induction  is  a  very  useful  device  for  proving  theorems  about  the  integers.  The  procedure 
is  as  follows. 


Procedure  A.8:  Proof  by  Mathematical  Induction 


Suppose $S_n$ is a statement which is a function of the number n, for $n = 1, 2, \cdots$, and we wish to show that $S_n$ is true for all $n \ge 1$. To do so using mathematical induction, use the following steps.

1. Base Case: Show $S_1$ is true.

2. Assume $S_n$ is true for some n, which is the induction hypothesis. Then, using this assumption, show that $S_{n+1}$ is true.

Proving these two steps shows that $S_n$ is true for all $n = 1, 2, \cdots$.


We can use this procedure to solve the following examples.

Solution. By Procedure A.8, we first need to show that this statement is true for n = 1. When n = 1, the statement says that

\[ \sum_{k=1}^{1} k^2 = \frac{1(1+1)(2(1)+1)}{6} = \frac{6}{6} = 1 \]

The sum on the left hand side also equals 1, so this equation is true for n = 1.

Now suppose this formula is valid for some $n \ge 1$ where n is an integer. Hence, the following equation is true.

\[ \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6} \tag{1.1} \]

We want to show that this is true for n + 1.

Suppose we add $(n+1)^2$ to both sides of equation 1.1.

\begin{align*}
\sum_{k=1}^{n+1} k^2 &= \sum_{k=1}^{n} k^2 + (n+1)^2 \\
&= \frac{n(n+1)(2n+1)}{6} + (n+1)^2
\end{align*}

The step going from the first to the second line is based on the assumption that the formula is true for n. Now simplify the expression in the second line,

\[ \frac{n(n+1)(2n+1)}{6} + (n+1)^2 \]

This equals

\[ (n+1) \left( \frac{n(2n+1)}{6} + (n+1) \right) \]

and

\[ \frac{n(2n+1)}{6} + (n+1) = \frac{6(n+1) + 2n^2 + n}{6} = \frac{(n+2)(2n+3)}{6} \]

Therefore,

\[ \sum_{k=1}^{n+1} k^2 = \frac{(n+1)(n+2)(2n+3)}{6} = \frac{(n+1)((n+1)+1)(2(n+1)+1)}{6} \]

showing the formula holds for n + 1 whenever it holds for n. This proves the formula by mathematical induction. In other words, this formula is true for all $n = 1, 2, \cdots$. ♠
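
A numerical check is not a proof, but it is a useful sanity test of a formula proved by induction. The sketch below compares the sum of the first n squares with the closed form n(n+1)(2n+1)/6 for a range of n; the function names are illustrative only.

```python
def sum_of_squares(n):
    # Direct sum 1^2 + 2^2 + ... + n^2.
    return sum(k * k for k in range(1, n + 1))


def closed_form(n):
    # The formula n(n+1)(2n+1)/6; the product is always divisible by 6.
    return n * (n + 1) * (2 * n + 1) // 6


# Compare the two for n = 1, ..., 200.
mismatches = [n for n in range(1, 201) if sum_of_squares(n) != closed_form(n)]
print(mismatches)
```

An empty list of mismatches is consistent with the induction argument, which is what actually establishes the formula for every n.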


Consider another example.

Solution. Again we will use the procedure given in Procedure A.8 to prove that this statement is true for all n. Suppose n = 1. Then the statement says

\[ \frac{1}{2} < \frac{1}{\sqrt{3}} \]

which is true.

Suppose then that the inequality holds for n. In other words,

\[ \frac{1}{2} \cdot \frac{3}{4} \cdots \frac{2n-1}{2n} < \frac{1}{\sqrt{2n+1}} \]

is true.

Now multiply both sides of this inequality by $\frac{2n+1}{2n+2}$. This yields

\[ \frac{1}{2} \cdot \frac{3}{4} \cdots \frac{2n-1}{2n} \cdot \frac{2n+1}{2n+2} < \frac{1}{\sqrt{2n+1}} \cdot \frac{2n+1}{2n+2} = \frac{\sqrt{2n+1}}{2n+2} \]

The theorem will be proved if this last expression is less than $\frac{1}{\sqrt{2n+3}}$. This happens if and only if

\[ \left( \frac{1}{\sqrt{2n+3}} \right)^2 = \frac{1}{2n+3} > \frac{2n+1}{(2n+2)^2} \]

which occurs if and only if $(2n+2)^2 > (2n+3)(2n+1)$ and this is clearly true which may be seen from expanding both sides. This proves the inequality. ♠
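
The inequality in this solution can also be spot-checked with exact arithmetic. The sketch below assumes the statement being proved is that the product (1/2)(3/4)...((2n-1)/(2n)) is less than 1/sqrt(2n+1), as the solution indicates. Since both sides are positive, it compares their squares using Python's `fractions.Fraction`, which avoids floating point error. The function names are illustrative.

```python
from fractions import Fraction


def odd_even_product(n):
    # Exact value of (1/2)(3/4)...((2n-1)/(2n)).
    product = Fraction(1)
    for k in range(1, n + 1):
        product *= Fraction(2 * k - 1, 2 * k)
    return product


def inequality_holds(n):
    # Both sides are positive, so p < 1/sqrt(2n+1) is equivalent to
    # p^2 < 1/(2n+1), which can be tested exactly.
    p = odd_even_product(n)
    return p * p < Fraction(1, 2 * n + 1)


print(all(inequality_holds(n) for n in range(1, 60)))
```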

Let's review the process just used. If S is the set of integers at least as large as 1 for which the formula holds, the first step was to show $1 \in S$ and then that whenever $n \in S$, it follows $n + 1 \in S$. Therefore, by the principle of mathematical induction, S contains $[1, \infty) \cap \mathbb{Z}$, all positive integers. In doing an inductive proof of this sort, the set S is normally not mentioned. One just verifies the steps above.


B.  Selected  Exercise  Answers 


111  4^  =  3  ’  Solution  is:  [x  =  . 

1.1.2  'V  ^  ,  Solution  is:  [x—  l,y  =  0] 

121  4^_3y  I  3  ’  Solution  is:  [x  =  |f  ,y  = 

122  Ix-y  I  3  ’  Solution  is:  [x  =  { §,y  = 


1.2.3 $\begin{array}{c} x + 2y = 1 \\ 2x - y = 1 \\ 4x + 3y = 3 \end{array}$, Solution is: $[x = \frac{3}{5}, y = \frac{1}{5}]$


1.2.4 No solution exists. You can see this by writing the augmented matrix and doing row operations.

\[ \begin{bmatrix} 1 & 1 & -3 & 2 \\ 2 & 1 & 1 & 1 \\ 3 & 2 & -2 & 0 \end{bmatrix} \text{, row echelon form: } \begin{bmatrix} 1 & 0 & 4 & 0 \\ 0 & 1 & -7 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]

Thus one of the equations says 0 = 1 in an equivalent system of equations.
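
The row reduction in 1.2.4 can be reproduced with a short program. The following is a minimal sketch of Gauss-Jordan elimination using exact fractions, not a library routine. It assumes the augmented matrix of 1.2.4 has rows (1, 1, -3, 2), (2, 1, 1, 1), (3, 2, -2, 0); the function name `rref` is illustrative.

```python
from fractions import Fraction


def rref(rows):
    # Reduced row-echelon form by Gauss-Jordan elimination, exact arithmetic.
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        # Look for a nonzero entry in this column at or below pivot_row.
        pivot = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        # Row operation: switch two rows.
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        # Row operation: scale the pivot row so the pivot entry is 1.
        scale = m[pivot_row][col]
        m[pivot_row] = [x / scale for x in m[pivot_row]]
        # Row operation: subtract multiples of the pivot row from the others.
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m


augmented = [[1, 1, -3, 2],
             [2, 1, 1, 1],
             [3, 2, -2, 0]]
reduced = rref(augmented)
for row in reduced:
    print([str(x) for x in row])
```

A last row of the form (0, 0, 0, 1) corresponds to the equation 0 = 1, which is how the inconsistency shows up.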


1.2.5 $\begin{array}{c} 4g - I = 150 \\ 4I - 17g = -660 \\ 4g + s = 290 \\ g + I + s - b = 0 \end{array}$, Solution is: $\{g = 60, I = 90, b = 200, s = 50\}$


1.2.6  The  solution  exists  but  is  not  unique. 


1.2.7  A  solution  exists  and  is  unique. 

1.2.9  There  might  be  a  solution.  If  so,  there  are  infinitely  many. 


1.2.10 No. Consider x + y + z = 2 and x + y + z = 1.


1.2.11 These can have a solution. For example, x + y = 1, 2x + 2y = 2, 3x + 3y = 3 even has an infinite set of solutions.


1.2.12 h = 4

1.2.13 Any h will work.

1.2.14 Any h will work.

1.2.15 If $h \ne 2$ there will be a unique solution for any k. If h = 2 and $k \ne 4$, there are no solutions. If h = 2 and k = 4, then there are infinitely many solutions.

1.2.16 If $h \ne 4$, then there is exactly one solution. If h = 4 and $k \ne 4$, then there are no solutions. If h = 4 and k = 4, then there are infinitely many solutions.


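1.2.17 There is no solution. The system is inconsistent. You can see this from the augmented matrix.

\[ \begin{bmatrix} 1 & 2 & 1 & -1 & 2 \\ 1 & -1 & 1 & 1 & 1 \\ 2 & 1 & -1 & 0 & 1 \\ 4 & 2 & 1 & 0 & 5 \end{bmatrix} \text{, reduced row-echelon form: } \begin{bmatrix} 1 & 0 & 0 & \frac{1}{3} & 0 \\ 0 & 1 & 0 & -\frac{2}{3} & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \]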

1.2.18  Solution  is:  w  —  \y  —  \,x— \  —  \y,z—  \ 


1.2.19  (a)  This  one  is  not. 

(b)  This  one  is. 

(c)  This  one  is. 


1.2.28  The  reduced  row-echelon  form  is 
t,y=\  +  t{^),x=\  —  \t  where  tel. 

1.2.29  The  reduced  row-echelon  form  is 

1.2.30  The  reduced  row-echelon  form  is 
lt,x2  —  At,x\  =3  —  9 1. 

1.2.31  The  reduced  row-echelon  form  is 

the  other  variables  are  given  by  X4  =  —  \  - 


"  1 

0 

1  1  1  1 

2  1  2 

0 

1 

— 

1  I  3 

4  1  4 

_  0 

0 

0  1  0  J 

1 

0 

4 

2  ' 

0 

1 

-4 

-1 

1 

0 

0 

0  9 

0 

1 

0 

^3- 

1 

0 

0 

0 

1 

0 

1 

-j 

0 

0 

0 

1  6 

1 

0 

2 

— <I<N 

1 

0 

0 

1 

0 

0  \ 

0 

0 

0 

1  I 

0 

0 

0 

0  0 

.  Therefore,  the  solution  is  of  the  form  z  — 


and  so  the  solution  is  z  — t,y  — 4t,x  — 2  —  At. 


3 

0 

-1 

1 


and  S0.V5  =  t,x 4  —  1  —  6t,xs  =  —  1  + 


Therefore,  let  *5  —t,x 3  =  5.  Then 


%t,x 2  =  \  —  t\,,x\  =  %  +  \t  —  2s.

1.2.32 Solution is: [x = 1 - 2t, z = 1, y = t]

1.2.33 Solution is: [x = 2 - 4t, y = -8t, z = t]

1.2.34 Solution is: [x = -1, y = 2, z = -1]

1.2.35 Solution is: [x = 2, y = 4, z = 5]

1.2.36 Solution is: [x = 1, y = 2, z = -5]

1.2.37 Solution is: [x = -1, y = -5, z = 4]

1.2.38 Solution is: [x = 2t + 1, y = 4t, z = t]

1.2.39 Solution is: [x = 1, y = 5, z = 3]

1.2.40 Solution is: [x = 4, y = -4, z = -2]

1.2.41 No. Consider x + y + z = 2 and x + y + z = 1.

1.2.42  No.  This  would  lead  to  0  =  1 . 

1.2.43  Yes.  It  has  a  unique  solution. 

1.2.44  The  last  column  must  not  be  a  pivot  column.  The  remaining  columns  must  each  be  pivot  columns. 

1.2.45 You need $\begin{array}{c} \frac{1}{4}(20 + 30 + w + x) - y = 0 \\ \frac{1}{4}(y + 30 + 0 + z) - w = 0 \\ \frac{1}{4}(20 + y + z + 10) - x = 0 \\ \frac{1}{4}(x + w + 0 + 10) - z = 0 \end{array}$, Solution is: [w = 15, x = 15, y = 20, z = 10].

1.2.57 It is because you cannot have more than min(m, n) nonzero rows in the reduced row-echelon form. Recall that the number of pivot columns is the same as the number of nonzero rows from the description of this reduced row-echelon form.


1.2.58  (a)  This  says  B  is  in  the  span  of  four  of  the  columns.  Thus  the  columns  are  not  independent. 

Infinite  solution  set. 

(b)  This  surely  can’t  happen.  If  you  add  in  another  column,  the  rank  does  not  get  smaller. 

(c)  This  says  B  is  in  the  span  of  the  columns  and  the  columns  must  be  independent.  You  can’t  have  the 
rank  equal  4  if  you  only  have  two  columns. 

(d) This says B is not in the span of the columns. In this case, there is no solution to the system of equations represented by the augmented matrix.

(e) In this case, there is a unique solution since the columns of A are independent.

1.2.59  These  are  not  legitimate  row  operations.  They  do  not  preserve  the  solution  set  of  the  system. 

1.2.62 The other two equations are

\begin{align*}
6I_3 - 6I_4 + I_3 + I_3 + 5I_3 - 5I_2 &= -20 \\
2I_4 + 3I_4 + 6I_4 - 6I_3 + I_4 - I_1 &= 0
\end{align*}

Then the system is

\begin{align*}
2I_2 - 2I_1 + 5I_2 - 5I_3 + 3I_2 &= 5 \\
4I_1 + I_1 - I_4 + 2I_1 - 2I_2 &= -10 \\
6I_3 - 6I_4 + I_3 + I_3 + 5I_3 - 5I_2 &= -20 \\
2I_4 + 3I_4 + 6I_4 - 6I_3 + I_4 - I_1 &= 0
\end{align*}

The solution is:

\[ I_1 = -\frac{750}{373}, \quad I_2 = -\frac{1421}{1119}, \quad I_3 = -\frac{3061}{1119}, \quad I_4 = -\frac{1718}{1119} \]

1.2.63 You have

\begin{align*}
2I_1 + 5I_1 + 3I_1 - 5I_2 &= 10 \\
I_2 - I_3 + 3I_2 + 7I_2 + 5I_2 - 5I_1 &= -12 \\
2I_3 + 4I_3 + 4I_3 + I_3 - I_2 &= 0
\end{align*}

Simplifying this yields

\begin{align*}
10I_1 - 5I_2 &= 10 \\
-5I_1 + 16I_2 - I_3 &= -12 \\
-I_2 + 11I_3 &= 0
\end{align*}

The solution is given by

\[ I_1 = \frac{218}{295}, \quad I_2 = -\frac{154}{295}, \quad I_3 = -\frac{14}{295} \]

2.1.3 To get $-A$, just replace every entry of A with its additive inverse. The 0 matrix is the one which has all zeros in it.

2.1.5 Suppose B also works. Then

\[ -A = -A + (A + B) = (-A + A) + B = 0 + B = B \]

2.1.6 Suppose $0'$ also works. Then $0' = 0' + 0 = 0$.

2.1.7 $0A = (0 + 0)A = 0A + 0A$. Now add $-(0A)$ to both sides. Then $0 = 0A$.

2.1.8 $A + (-1)A = (1 + (-1))A = 0A = 0$. Therefore, from the uniqueness of the additive inverse proved in the above Problem 2.1.5, it follows that $-A = (-1)A$.


2.1.9  (a) 


-3  -6  -9 

-6  -3  -21 


(b) 


-5 

5 


3 

-4 


(c)  Not  possible 


(d) 


-3  3  4 

6-17 


(e)  Not  possible 


(f)  Not  possible 


2.1.10  (a) 


-3  -6 
-9  -6 
-3  3 


(b)  Not  possible. 


(c) 


11  2 
13  6 
-4  2 


(d)  Not  possible. 


(e) 


7 

9 

-2 


(f)  Not  possible. 


(g)  Not  possible. 


2.1.11  (a) 


3  0-4 
-4  1  6 

5  1  -6 


(b)

(c) Not possible


(d) 


-4 

-5 

-1 


-6 

-3 

-2 


(e) 


8  1  -3 
7  6-6 


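2.1.12

\[ \begin{bmatrix} -1 & -1 \\ 3 & 3 \end{bmatrix} \begin{bmatrix} x & y \\ z & w \end{bmatrix} = \begin{bmatrix} -x - z & -w - y \\ 3x + 3z & 3w + 3y \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]

Solution is: w = -y, x = -z so the matrices are of the form $\begin{bmatrix} x & y \\ -x & -y \end{bmatrix}$.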

2.1.13 $X^T Y = \begin{bmatrix} 0 & -1 & -2 \\ 0 & -1 & -2 \\ 0 & 1 & 2 \end{bmatrix}$, $XY^T = 1$


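2.1.14

\begin{align*}
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & k \end{bmatrix} &= \begin{bmatrix} 7 & 2k + 2 \\ 15 & 4k + 6 \end{bmatrix} \\
\begin{bmatrix} 1 & 2 \\ 3 & k \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} &= \begin{bmatrix} 7 & 10 \\ 3k + 3 & 4k + 6 \end{bmatrix}
\end{align*}

Thus you must have

\[ \begin{array}{c} 3k + 3 = 15 \\ 2k + 2 = 10 \end{array} \]

Solution is: [k = 4]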
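
The value of k in 2.1.14 can be confirmed by brute force. The sketch below assumes the two matrices in the problem are A with rows (1, 2), (3, 4) and B with rows (1, 2), (3, k), and searches a small range of integers for values of k where AB = BA. The helper names are illustrative.

```python
def matmul(X, Y):
    # Product of two matrices stored as lists of rows.
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


A = [[1, 2],
     [3, 4]]


def commutes_with_A(k):
    # Does B = [[1, 2], [3, k]] satisfy AB = BA?
    B = [[1, 2], [3, k]]
    return matmul(A, B) == matmul(B, A)


commuting_values = [k for k in range(-10, 11) if commutes_with_A(k)]
print(commuting_values)
```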


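2.1.15

\begin{align*}
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & k \end{bmatrix} &= \begin{bmatrix} 3 & 2k + 2 \\ 7 & 4k + 6 \end{bmatrix} \\
\begin{bmatrix} 1 & 2 \\ 1 & k \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} &= \begin{bmatrix} 7 & 10 \\ 3k + 1 & 4k + 2 \end{bmatrix}
\end{align*}

However, $7 \ne 3$ and so there is no possible choice of k which will make these matrices commute.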


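2.1.16 Let $A = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$, $B = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, $C = \begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix}$.

\begin{align*}
\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} &= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \\
\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix} &= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\end{align*}

2.1.18 Let $A = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$, $B = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$.

\[ \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]

2.1.20 Let $A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, $B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$.

\begin{align*}
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} &= \begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix} \\
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} &= \begin{bmatrix} 2 & 1 \\ 4 & 3 \end{bmatrix}
\end{align*}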

2.1.21  A 


1-12  0 
1  0  2  0 

0  0  3  0 

1  3  0  3 


2.1.22  A 


13  2  0 
10  2  0 
0  0  6  0 
13  0  1 


2.1.23  A 


1110 
112  0 
-10  10 
10  0  3 


2.1.26  (a)  Not  necessarily  true. 

(b)  Not  necessarily  true. 

(c)  Not  necessarily  true. 

(d)  Necessarily  true. 

(e)  Necessarily  true. 

(f)  Not  necessarily  true. 

(g)  Not  necessarily  true. 


2.1.27  (a) 


-3  -9  -3 
-6  -6  3 


(b) 


5  -18  5 
-11  4  4 
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(c)  [ 

(d) 

(e) 


-715' 


1  3 
3  9 


13  -16 
16  29 

1  -8 


5  7 
5  15 


(f) 

(g)  Not  possible. 


1 

-8 

5 


2.1.28 Show that $\frac{1}{2}(A^T + A)$ is symmetric and then consider using this as one of the matrices. $A = \frac{A + A^T}{2} + \frac{A - A^T}{2}$.

2.1.29 If A is skew symmetric then $A = -A^T$. It follows that $a_{ii} = -a_{ii}$ and so each $a_{ii} = 0$.


2.1.31 $(I_m A)_{ij} = \sum_k \delta_{ik} A_{kj} = A_{ij}$

2.1.32 Yes, B = C. Multiply AB = AC on the left by $A^{-1}$.


2.1.34  A 


1  0  0 

0-10 
0  0  1 


2.1.35 


2.1.36 


-l 

r  3 

1  I 

2 

1  ' 

7 

7 

-1 

3 

i 

7 

2 

7 

'  0 

1  ' 

-l 

r  3 

5 

i  ■ 

5 

5 

3 

l 

0 

2.1.37 


2  1 
3  0 


'21' 

-l 

r  i  n 

2.1.38 

does  not  exist.  The  reduced  row-echelon  form  of  this  matrix  is 

z 

4  2 

1 

O 

O 

_ i 

a  b 

-l 

r  d 

b  ] 

ad— be 

ad— be 

c  d 

c 

a 

ad— be 

ad— be 

2.1.39 
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'  1 

2 

3  ' 

-l 

"  -2 

4  -5 

2.1.40 

2 

1 

4 

0 

1  -2 

1 

0 

2 

1 

-2  3 

'  1 

0 

3  ' 

! 

'  -2 

0  3  ' 

1  2 

3  3 

2.1.41 

2 

3 

4 

= 

0 

1 

0 

2 

1 

0  -1  _ 

2.1.42  The  reduced  row-echelon  form  is 


1  0  | 
0  .  1 
0  0  0 


There  is  no  inverse. 


2.1.43 


12  0  2 
11  2  0 
2  1-32 
12  12 


-1 


1  i  I  1 

2  2  2 


3 


1  _!  5 

2  2  2 


0 

1 

4 


1 

9 

4 


2.1.45  (a) 

X 

y 

— 

■  1  ■ 

2 

3 

z 

o 

X 

"  -12  ' 

(b) 

y 

z 

— 

1 

5 

X 

3c  —  2  a 

y 

= 

\b~\c 

z 

a  —  c 

2.1.46 Multiply both sides of AX = B on the left by $A^{-1}$.


2.1.47 Multiply on both sides on the left by $A^{-1}$. Thus

\[ 0 = A^{-1} 0 = A^{-1}(AX) = (A^{-1}A)X = IX = X \]

2.1.48 $A^{-1} = A^{-1}I = A^{-1}(AB) = (A^{-1}A)B = IB = B$.

2.1.49 You need to show that $(A^{-1})^T$ acts like the inverse of $A^T$ because from uniqueness in the above problem, this will imply it is the inverse. From properties of the transpose,

\begin{align*}
A^T (A^{-1})^T &= (A^{-1}A)^T = I^T = I \\
(A^{-1})^T A^T &= (AA^{-1})^T = I^T = I
\end{align*}

Hence $(A^{-1})^T = (A^T)^{-1}$ and this last matrix exists.

2.1.50 $(AB)B^{-1}A^{-1} = A(BB^{-1})A^{-1} = AA^{-1} = I$ and $B^{-1}A^{-1}(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I$

2.1.51 The proof of this exercise follows from the previous one.

2.1.52 $A^2 (A^{-1})^2 = AAA^{-1}A^{-1} = AIA^{-1} = AA^{-1} = I$ and $(A^{-1})^2 A^2 = A^{-1}A^{-1}AA = A^{-1}IA = A^{-1}A = I$

2.1.53 $A^{-1}A = AA^{-1} = I$ and so by uniqueness, $(A^{-1})^{-1} = A$.

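2.2.1 $\begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 3 \\ 1 & 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 0 \\ 0 & -3 & 3 \\ 0 & 0 & 3 \end{bmatrix}$

2.2.2 $\begin{bmatrix} 1 & 2 & 3 & 2 \\ 1 & 3 & 2 & 1 \\ 5 & 0 & 1 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 5 & -10 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 & 2 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & -24 & -17 \end{bmatrix}$

2.2.3 $\begin{bmatrix} 1 & -2 & -5 & 0 \\ -2 & 5 & 11 & 3 \\ 3 & -6 & -15 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 3 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & -2 & -5 & 0 \\ 0 & 1 & 1 & 3 \\ 0 & 0 & 0 & 1 \end{bmatrix}$

2.2.4 $\begin{bmatrix} 1 & -1 & -3 & -1 \\ -1 & 2 & 4 & 3 \\ 2 & -3 & -7 & -3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & -1 & -3 & -1 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix}$

2.2.5 $\begin{bmatrix} 1 & -3 & -4 & -3 \\ -3 & 10 & 10 & 10 \\ 1 & -6 & 2 & -5 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 1 & -3 & 1 \end{bmatrix} \begin{bmatrix} 1 & -3 & -4 & -3 \\ 0 & 1 & -2 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix}$

2.2.6 $\begin{bmatrix} 1 & 3 & 1 & -1 \\ 3 & 10 & 8 & -1 \\ 2 & 5 & -3 & -3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 2 & -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 3 & 1 & -1 \\ 0 & 1 & 5 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix}$

2.2.7 $\begin{bmatrix} 3 & -2 & 1 \\ 9 & -8 & 6 \\ -6 & 2 & 2 \\ 3 & 2 & -7 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 3 & 1 & 0 & 0 \\ -2 & 1 & 1 & 0 \\ 1 & -2 & -2 & 1 \end{bmatrix} \begin{bmatrix} 3 & -2 & 1 \\ 0 & -2 & 3 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$

2.2.9 $\begin{bmatrix} -1 & -3 & -1 \\ 1 & 3 & 0 \\ 3 & 9 & 0 \\ 4 & 12 & 16 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ -3 & 0 & 1 & 0 \\ -4 & 0 & -4 & 1 \end{bmatrix} \begin{bmatrix} -1 & -3 & -1 \\ 0 & 0 & -1 \\ 0 & 0 & -3 \\ 0 & 0 & 0 \end{bmatrix}$

2.2.10 An LU factorization of the coefficient matrix is

\[ \begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 0 & -1 \end{bmatrix} \]

First solve

\[ \begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} 5 \\ 6 \end{bmatrix} \]

which gives $\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} 5 \\ -4 \end{bmatrix}$. Then solve

\[ \begin{bmatrix} 1 & 2 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 5 \\ -4 \end{bmatrix} \]

which says that y = 4 and x = -3.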
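
The two-stage solve in 2.2.10 (first with L, then with U) can be sketched in a few lines of Python. This assumes the factors are L with rows (1, 0), (2, 1) and U with rows (1, 2), (0, -1), and the right-hand side is (5, 6). The function names are illustrative and the code assumes nonzero diagonal entries.

```python
from fractions import Fraction


def forward_substitution(L, b):
    # Solve L y = b for lower triangular L, working from the top row down.
    y = []
    for i, row in enumerate(L):
        known = sum(row[j] * y[j] for j in range(i))
        y.append(Fraction(b[i] - known) / row[i])
    return y


def back_substitution(U, y):
    # Solve U x = y for upper triangular U, working from the bottom row up.
    n = len(U)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = Fraction(y[i] - known) / U[i][i]
    return x


L = [[1, 0],
     [2, 1]]
U = [[1, 2],
     [0, -1]]
b = [5, 6]

y = forward_substitution(L, b)   # intermediate vector (u, v)
x = back_substitution(U, y)      # final solution (x, y)
print([str(t) for t in y], [str(t) for t in x])
```

The point of the factorization is that each triangular system is solved by simple substitution, with no further row reduction needed.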


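2.2.11 An LU factorization of the coefficient matrix is

\[ \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 3 \\ 2 & 3 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} \]

First solve

\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & -1 & 1 \end{bmatrix} \begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 6 \end{bmatrix} \]

which yields u = 1, v = 2, w = 6. Next solve

\[ \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 6 \end{bmatrix} \]

This yields z = 6, y = -16, x = 27.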

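2.2.13 An LU factorization of the coefficient matrix is

\[ \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 3 & 5 & 4 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 0 & -1 & -5 \\ 0 & 0 & 0 \end{bmatrix} \]

First solve

\[ \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 1 & 1 \end{bmatrix} \begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} 5 \\ 6 \\ 11 \end{bmatrix} \]

Solution is: $\begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} 5 \\ -4 \\ 0 \end{bmatrix}$. Next solve

\[ \begin{bmatrix} 1 & 2 & 3 \\ 0 & -1 & -5 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 5 \\ -4 \\ 0 \end{bmatrix} \]

Solution is: $\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 7t - 3 \\ 4 - 5t \\ t \end{bmatrix}$, $t \in \mathbb{R}$.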

2.2.14 Sometimes there is more than one LU factorization as is the case in this example. The given equation clearly gives an LU factorization. However, it appears that the following equation gives another LU factorization.

\[ \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} \]

3.1.3  (a)  The  answer  is  31. 

(b)  The  answer  is  375. 

(c)  The  answer  is  —2. 

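3.1.4 $\begin{vmatrix} 1 & 2 & 1 \\ 2 & 1 & 3 \\ 2 & 1 & 1 \end{vmatrix} = 6$

3.1.5 $\begin{vmatrix} 1 & 2 & 1 \\ 1 & 0 & 1 \\ 2 & 1 & 1 \end{vmatrix} = 2$

3.1.6 $\begin{vmatrix} 1 & 2 & 1 \\ 2 & 1 & 3 \\ 2 & 1 & 1 \end{vmatrix} = 6$

3.1.7 $\begin{vmatrix} 1 & 0 & 0 & 1 \\ 2 & 1 & 1 & 0 \\ 0 & 0 & 0 & 2 \\ 2 & 1 & 3 & 1 \end{vmatrix} = -4$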

3.1.9  It  does  not  change  the  determinant.  This  was  just  taking  the  transpose. 

3.1.10  In  this  case  two  rows  were  switched  and  so  the  resulting  determinant  is  —1  times  the  first. 

3.1.11  The  determinant  is  unchanged.  It  was  just  the  first  row  added  to  the  second. 

3.1.12 The second row was multiplied by 2 so the determinant of the result is 2 times the original determinant.

3.1.13 In this case the two columns were switched so the determinant of the second is -1 times the determinant of the first.

3.1.14 If the determinant is nonzero, then it will remain nonzero with row operations applied to the matrix. However, by assumption, you can obtain a row of zeros by doing row operations. Thus the determinant must have been zero after all.

3.1.15 $\det(aA) = \det(aIA) = \det(aI)\det(A) = a^n \det(A)$. The matrix which has a down the main diagonal has determinant equal to $a^n$.
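
The scaling rule in 3.1.15 is easy to test on a small example. The sketch below computes determinants by cofactor expansion along the first row, which is fine for small matrices but slow for large ones, and compares det(aA) with a^n det(A). The sample matrix and the scalar a = 3 are arbitrary choices for illustration.

```python
def det(m):
    # Determinant by cofactor (Laplace) expansion along the first row.
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        # Minor: delete the first row and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


A = [[1, 2, 1],
     [2, 1, 3],
     [2, 1, 1]]
a = 3
n = len(A)

scaled = [[a * entry for entry in row] for row in A]
print(det(A), det(scaled), a ** n * det(A))
```

Each of the n rows contributes one factor of a, which is where the power a^n comes from.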


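3.1.16

\[ \det\left( \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} -1 & 2 \\ -5 & 6 \end{bmatrix} \right) = -8 \]

\[ \det \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \det \begin{bmatrix} -1 & 2 \\ -5 & 6 \end{bmatrix} = -2 \times 4 = -8 \]

3.1.17 This is not true at all. Consider $A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, $B = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$.

3.1.18 It must be 0 because $0 = \det(0) = \det(A^k) = (\det(A))^k$.

3.1.19 You would need $\det(AA^T) = \det(A)\det(A^T) = \det(A)^2 = 1$ and so $\det(A) = 1$, or $-1$.

3.1.20 $\det(A) = \det(S^{-1}BS) = \det(S^{-1})\det(B)\det(S) = \det(B)\det(S^{-1}S) = \det(B)$.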


3.1.21  (a)  False.  Consider 


1  1  2 
-15  4 
0  3  3 


(b)  True. 

(c)  False. 

(d)  False. 

(e)  True. 

(f)  True. 

(g)  True. 

(h)  True. 

(i)  True. 

(j)  True. 


3.1.22 $\begin{vmatrix} 1 & 2 & 1 \\ 2 & 3 & 2 \\ -4 & 1 & 2 \end{vmatrix} = -6$

3.1.23 $\begin{vmatrix} 2 & 1 & 3 \\ 2 & 4 & 2 \\ 1 & 4 & -5 \end{vmatrix} = -32$


3.1.24  One  can  row  reduce  this  using  only  row  operation  3  to 

'1  2  1  2  ' 

0  -5  -5  -3 

0  0  2  f 

0  0  0  -^ 

and  therefore,  the  determinant  is  —63. 

1  2  1 

3  1  -2 

-10  3 

2  3  2 


3.1.25  One  can  row  reduce  this  using  only  row  operation  3  to 


Thus  the  determinant  is  given  by 


14  1  2 

0  -10  -5  -3 

0  0  2  ^ 

0  0  0 


14  12 

3  2-2  3 

-10  3  3 

2  1  2-2 


-211 


3.2.1  det 


1  2  3 
0  2  1 
3  1  0 


—  13  and  so  it  has  an  inverse.  This  inverse  is 


2 

1 

0 

1 

0 

2 

1 

0 

3 

0 

3 

1 

2 

3 

1 

3 

1 

2 

1 

0 

3 

0 

3 

1 

2 

3 

1 

3 

1 

2 

2 

1 

0 

1 

0 

2 

1 

^13 


l 

3 

4 

13 

13 

13 

3 

9 

1 

13 

13 

13 

6 

5 

2 

13 

13 

13 

-1  3  -6 

3-9  5 

-4  -1  2 
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1  _2  2 
111 


6  5  2 

111 


3.2.3 

"13  3' 
det  2  4  1  =3 

.  0  1  1  . 

so  it  has  an  inverse  which  is 

1  0  -3 

2  15 

3  3  3 

2  _  1  2 

3  3  3 


3.2.5 

10  3" 

1  0  1  =2 
3  10 

and  so  it  has  an  inverse.  The  inverse  turns  out  to  equal 

4  I  oi 


1  _I  o 

2  2  u 

3.2.6  (a)  j  *  =1 

1  2  3 

(b)  0  2  1  =  —15 

4  1  1 

1  2  1 

(c)  2  3  0  =  0 

0  1  2 

3.2.7  No.  It  has  a  nonzero  determinant  for  all  t 


"12  0]  I"  1  3  -6  1  T 

3.2.2  det  0  2  1  =  7  so  it  has  an  inverse.  This  inverse  is  A  —2  15  = 

3  1  1  J  [2-12 


"1  t  t2  " 

det  0  12 1  —t3  +  2 

t  0  2 


and  so  it  has  no  inverse  when  t  —  —\[2 
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3.2.9 


det 


eJ  cosh/  sinh/ 
e 1  sinh/  cosh/ 
e1  cosh/  sinh/ 


=  0 


and  so  this  matrix  fails  to  have  a  nonzero  determinant  at  any  value  of  /. 


3.2.10 


det 


e~r  cost 

—e~l  cost  —  e~f  sin/ 
2e~l  sin/ 


e~f  sin/ 

—e~‘  sin t  +  e~‘  cost 
—2e~‘  cost 


and  so  this  matrix  is  always  invertible. 


=  5e~f  ±  0 


3.2.11  If  det  (A)  ^  0,  then  A  1  exists  and  so  you  could  multiply  on  both  sides  on  the  left  by  A  1  and  obtain 
that  X  =  0. 


3.2.12  You  have  1  =  det  (A)  det  ( B ).  Hence  both  A  and  B  have  inverses.  Letting  X  be  given, 

A(BA-I)X  =  (AB)AX  -AX  =  AX  -AX  =  0 

and  so  it  follows  from  the  above  problem  that  (BA  —  I)X  =  0.  Since  X  is  arbitrary,  it  follows  that  BA  =  7. 


3.2.13 


Hence  the  inverse  is 


3.2.14 


det 


0 

0 

Q  d  s*/-\c  4-  sit 


e  cos  t 

e‘  cost  —  ef  sin/ 


0 

ersin  / 

ef  cost  +  e1  sin/ 


=  e 


3 1 


e 


—3 1 


0  e2?cos/  +  e2rsin/  —  (e2t  cost  —  e2t  sin)  / 
0  —  e2fsin/  e2?cos(/) 


•-1  0  0 
0  e~‘  (cos/ +  sin/)  —  (sin/)e“f 

0  —  e~'  (cost  —  sin/)  (cos t)e~f 


ef  cost  sin  / 
er  —sin  /  cost 
e 1  —cost  —sin  / 

0 

j  cost  +  \  sin/  —sin  / 
\  sin/  —  \  cost  cost 


\  sin/  —  j  cost 
—  jcost—  j  sin/ 


3.2.15 The given condition is what it takes for the determinant to be nonzero. Recall that the determinant of an upper triangular matrix is just the product of the entries on the main diagonal.

3.2.16 This follows because $\det(ABC) = \det(A)\det(B)\det(C)$ and if this product is nonzero, then each determinant in the product is nonzero and so each of these matrices is invertible.

3.2.17 False.


3.2.18 Solution is: [x = 1, y = 0]


3.2.19 Solution is: [x = 1, y = 0, z = 0]. For example,


4.2.1 


-55 

13 

-21 

39 


4.2.3 


4.2.4  The  system 


has  no  solution. 


1  1  1 

2  2-1 


2  -1  -1 

1  0  1 


4  ' 

3  ' 

2  ' 

4 

=  2 

1 

— 

-2 

-3 

-1 

1 

"  4  ' 

3  ' 

2  ' 

4 

4 

=  a\ 

1 

-1 

+  0.2 

-2 

1 

"  1  ' 

'  2  ' 

2 

0 

3 

• 

1 

4 

3 

17 


4.7.2 This formula says that $\vec{u} \cdot \vec{v} = \|\vec{u}\| \|\vec{v}\| \cos\theta$ where $\theta$ is the included angle between the two vectors. Thus

\[ |\vec{u} \cdot \vec{v}| = \|\vec{u}\| \|\vec{v}\| |\cos\theta| \le \|\vec{u}\| \|\vec{v}\| \]

and equality holds if and only if $\theta = 0$ or $\pi$. This means that the two vectors either point in the same direction or opposite directions. Hence one is a multiple of the other.


4.7.3 This follows from the Cauchy Schwarz inequality and the proof of Theorem 4.29 which only used the properties of the dot product. Since this new product has the same properties the Cauchy Schwarz inequality holds for it as well.

4.7.6 $A\vec{x} \cdot \vec{y} = \sum_k (A\vec{x})_k y_k = \sum_k \sum_i A_{ki} x_i y_k = \sum_i \sum_k A^T_{ik} x_i y_k = \vec{x} \cdot A^T \vec{y}$

4.7.7

\begin{align*}
AB\vec{x} \cdot \vec{y} &= B\vec{x} \cdot A^T \vec{y} \\
&= \vec{x} \cdot B^T A^T \vec{y} \\
&= \vec{x} \cdot (AB)^T \vec{y}
\end{align*}

Since this is true for all $\vec{x}$, it follows that, in particular, it holds for

\[ \vec{x} = B^T A^T \vec{y} - (AB)^T \vec{y} \]

and so from the axioms of the dot product,

\[ \left( B^T A^T \vec{y} - (AB)^T \vec{y} \right) \cdot \left( B^T A^T \vec{y} - (AB)^T \vec{y} \right) = 0 \]

and so $B^T A^T \vec{y} - (AB)^T \vec{y} = \vec{0}$. However, this is true for all $\vec{y}$ and so $B^T A^T - (AB)^T = 0$.

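4.7.8

\[ \frac{\begin{bmatrix} 3 & -1 & -1 \end{bmatrix}^T \cdot \begin{bmatrix} 1 & 4 & 2 \end{bmatrix}^T}{\sqrt{9+1+1}\sqrt{1+16+4}} = \frac{-3}{\sqrt{11}\sqrt{21}} = -0.19739 = \cos\theta \]

Therefore we need to solve

\[ -0.19739 = \cos\theta \]

Thus $\theta = 1.7695$ radians.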
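
The angle in 4.7.8 can be recomputed from the dot product formula. The sketch below assumes the two vectors are (3, -1, -1) and (1, 4, 2), which is what the norms under the square roots and the value of the cosine suggest. The function name `angle_between` is illustrative.

```python
import math


def angle_between(u, v):
    # Included angle from u . v = ||u|| ||v|| cos(theta), in radians.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.acos(dot / (norm_u * norm_v))


u = [3, -1, -1]   # assumed first vector
v = [1, 4, 2]     # assumed second vector
theta = angle_between(u, v)
print(round(theta, 4))
```

A negative cosine means the angle is larger than a right angle, consistent with the value found above.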


-10 


479  _ 

\/ 1  +4+  IV 1+4+49 

0  =  2.031 3  radians. 


4.7.10 


^5 

14 


4.7.11  l^lu 

u»u 


^5 

10 


4.7.12  m,u 

u*u 


1 


-0.55555  =  cos0  Therefore  we  need  to  solve  —0.55555  =  cos0, 


wmcn  gives 


5  " 

"  1  ■ 

14 

2 

5 

— 

7 

3 

15 

14 

"  1  ' 

r  1  i 

2 

0 

= 

0 

_  3  _ 

3 

2 

2 

-2 

1+4+9 


'  1  ' 

l  " 
14 

2 

1 

7 

3 

3 

0 

14 

0 

4.7.15 No, it does not. The 0 vector has no direction. The formula for $\mathrm{proj}_{\vec{0}}(\vec{w})$ doesn't make sense either.
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4.7.16 

And  so 


You  get  equality  exactly  when  u  =  proj-M 


v  in  other  words,  when  u  is  a  multiple  of  v. 


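4.7.17

\begin{align*}
\vec{w} - \mathrm{proj}_{\vec{v}}(\vec{w}) + \vec{u} - \mathrm{proj}_{\vec{v}}(\vec{u}) &= \vec{w} + \vec{u} - \left( \mathrm{proj}_{\vec{v}}(\vec{w}) + \mathrm{proj}_{\vec{v}}(\vec{u}) \right) \\
&= \vec{w} + \vec{u} - \mathrm{proj}_{\vec{v}}(\vec{w} + \vec{u})
\end{align*}

This follows because

\begin{align*}
\mathrm{proj}_{\vec{v}}(\vec{w}) + \mathrm{proj}_{\vec{v}}(\vec{u}) &= \frac{\vec{w} \cdot \vec{v}}{\|\vec{v}\|^2} \vec{v} + \frac{\vec{u} \cdot \vec{v}}{\|\vec{v}\|^2} \vec{v} \\
&= \frac{(\vec{u} + \vec{w}) \cdot \vec{v}}{\|\vec{v}\|^2} \vec{v} \\
&= \mathrm{proj}_{\vec{v}}(\vec{w} + \vec{u})
\end{align*}

4.7.18 $(\vec{v} - \mathrm{proj}_{\vec{u}}(\vec{v})) \cdot \vec{u} = \vec{v} \cdot \vec{u} - \frac{\vec{v} \cdot \vec{u}}{\|\vec{u}\|^2} \vec{u} \cdot \vec{u} = \vec{v} \cdot \vec{u} - \vec{v} \cdot \vec{u} = 0$. Therefore, $\vec{v} = \vec{v} - \mathrm{proj}_{\vec{u}}(\vec{v}) + \mathrm{proj}_{\vec{u}}(\vec{v})$.

The first is perpendicular to $\vec{u}$ and the second is a multiple of $\vec{u}$ so it is parallel to $\vec{u}$.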

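4.9.1 If $\vec{a} \ne \vec{0}$, then the condition says that $\|\vec{a} \times \vec{u}\| = \|\vec{a}\| \sin\theta = 0$ for all angles $\theta$. Hence $\vec{a} = \vec{0}$ after all.


4.9.4 $\begin{bmatrix} 1 & 1 & 1 \end{bmatrix} \times \begin{bmatrix} 2 & 2 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}$. The area is 0. It means the three points are on the same line.

4.9.7 $(\vec{i} \times \vec{j}) \times \vec{j} = \vec{k} \times \vec{j} = -\vec{i}$. However, $\vec{i} \times (\vec{j} \times \vec{j}) = \vec{0}$ and so the cross product is not associative.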

4.9.8 Verify directly from the coordinate description of the cross product that the right hand rule applies to the vectors $\vec{i}, \vec{j}, \vec{k}$. Next verify that the distributive law holds for the coordinate description of the cross product. This gives another way to approach the cross product. First define it in terms of coordinates and then get the geometric properties from this. However, this approach does not yield the right hand rule property very easily. From the coordinate description,

\[ \vec{a} \times \vec{b} \cdot \vec{a} = \varepsilon_{ijk} a_j b_k a_i = -\varepsilon_{jik} a_j b_k a_i = -\varepsilon_{jik} b_k a_i a_j = -\vec{a} \times \vec{b} \cdot \vec{a} \]

and so $\vec{a} \times \vec{b}$ is perpendicular to $\vec{a}$. Similarly, $\vec{a} \times \vec{b}$ is perpendicular to $\vec{b}$. Now we need that

\[ \|\vec{a} \times \vec{b}\|^2 = \|\vec{a}\|^2 \|\vec{b}\|^2 (1 - \cos^2\theta) = \|\vec{a}\|^2 \|\vec{b}\|^2 \sin^2\theta \]

and so $\|\vec{a} \times \vec{b}\| = \|\vec{a}\| \|\vec{b}\| \sin\theta$, the area of the parallelogram determined by $\vec{a}, \vec{b}$. Only the right hand rule is a little problematic. However, you can see right away from the component definition that the right hand rule holds for each of the standard unit vectors. Thus $\vec{i} \times \vec{j} = \vec{k}$ etc.

\[ \begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{vmatrix} = \vec{k} \]

4.9.10
\[
\begin{vmatrix} 1 & -7 & -5 \\ 1 & -2 & -6 \\ 3 & 2 & 3 \end{vmatrix} = 113
\]
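The value 113 in 4.9.10 can be checked by a cofactor expansion along the first row. The sketch below is only a verification aid; the helper name `det3` is an illustrative choice, not notation from the text.

```python
def det3(m):
    """Determinant of a 3x3 matrix (list of rows), expanded along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Rows of the determinant printed in 4.9.10.
rows = [[1, -7, -5], [1, -2, -6], [3, 2, 3]]

# The absolute value of the box product is the volume of the parallelepiped.
volume = abs(det3(rows))
print(volume)
```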

4.9.11  Yes.  It  will  involve  the  sum  of  product  of  integers  and  so  it  will  be  an  integer. 

4.9.12  It  means  that  if  you  place  them  so  that  they  all  have  their  tails  at  the  same  point,  the  three  will  lie 
in  the  same  plane. 

4.9.13 $\vec{x}\cdot\left(\vec{a}\times\vec{b}\right) = 0$

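4.9.15 Here $[\vec{v},\vec{w},\vec{z}]$ denotes the box product. Consider the cross product term. From the above,
\[
\begin{aligned}
(\vec{v}\times\vec{w})\times(\vec{w}\times\vec{z}) &= [\vec{v},\vec{w},\vec{z}]\,\vec{w} - [\vec{w},\vec{w},\vec{z}]\,\vec{v} \\
&= [\vec{v},\vec{w},\vec{z}]\,\vec{w}
\end{aligned}
\]
Thus it reduces to
\[
(\vec{u}\times\vec{v})\cdot[\vec{v},\vec{w},\vec{z}]\,\vec{w} = [\vec{v},\vec{w},\vec{z}]\,[\vec{u},\vec{v},\vec{w}]
\]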

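4.9.16
\[
\begin{aligned}
\|\vec{u}\times\vec{v}\|^{2} &= \varepsilon_{ijk}u_{j}v_{k}\varepsilon_{irs}u_{r}v_{s} = \left(\delta_{jr}\delta_{ks} - \delta_{kr}\delta_{js}\right)u_{r}v_{s}u_{j}v_{k} \\
&= u_{j}v_{k}u_{j}v_{k} - u_{k}v_{j}u_{j}v_{k} = \|\vec{u}\|^{2}\|\vec{v}\|^{2} - (\vec{u}\cdot\vec{v})^{2}
\end{aligned}
\]
It follows that the expression reduces to 0. You can also do the following.
\[
\begin{aligned}
\|\vec{u}\times\vec{v}\|^{2} &= \|\vec{u}\|^{2}\|\vec{v}\|^{2}\sin^{2}\theta \\
&= \|\vec{u}\|^{2}\|\vec{v}\|^{2}\left(1 - \cos^{2}\theta\right) \\
&= \|\vec{u}\|^{2}\|\vec{v}\|^{2} - \|\vec{u}\|^{2}\|\vec{v}\|^{2}\cos^{2}\theta \\
&= \|\vec{u}\|^{2}\|\vec{v}\|^{2} - (\vec{u}\cdot\vec{v})^{2}
\end{aligned}
\]
which implies the expression equals 0.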

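4.9.17 We will show it using the summation convention and permutation symbol.
\[
\begin{aligned}
\left((\vec{u}\times\vec{v})'\right)_{i} &= \left((\vec{u}\times\vec{v})_{i}\right)' = \left(\varepsilon_{ijk}u_{j}v_{k}\right)' \\
&= \varepsilon_{ijk}u_{j}'v_{k} + \varepsilon_{ijk}u_{j}v_{k}' = \left(\vec{u}'\times\vec{v} + \vec{u}\times\vec{v}'\right)_{i}
\end{aligned}
\]
and so $(\vec{u}\times\vec{v})' = \vec{u}'\times\vec{v} + \vec{u}\times\vec{v}'$.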

4.10.10 $\sum_{i=1}^{k} 0\,\vec{x}_{i} = \vec{0}$


4.10.40  No.  Let  u  =  q  •  Then  2u  ^  M  although  u  e  M. 


4.10.41  No.  G  M  but  10  $M. 


4.10.42  This  is  not  a  subspace. 


is  in  it.  However,  (—1)  is  not. 


4.10.43  This  is  a  subspace  because  it  is  closed  with  respect  to  vector  addition  and  scalar  multiplication. 

4.10.44  Yes,  this  is  a  subspace  because  it  is  closed  with  respect  to  vector  addition  and  scalar  multiplication. 


4.10.45  This  is  not  a  subspace.  ^  is  in  it.  However  (—1) 


is  not. 


4.10.46  This  is  a  subspace.  It  is  closed  with  respect  to  vector  addition  and  scalar  multiplication. 


4.10.55  Yes.  If  not,  there  would  exist  a  vector  not  in  the  span.  But  then  you  could  add  in  this  vector  and 
obtain  a  linearly  independent  set  of  vectors  with  more  vectors  than  a  basis. 
4.10.56 They can't be.

4.10.57 Say $\sum_{i=1}^{k} c_{i}\vec{z}_{i} = \vec{0}$. Then apply $A$ to it as follows.
\[
\sum_{i=1}^{k} c_{i}A\vec{z}_{i} = \sum_{i=1}^{k} c_{i}\vec{w}_{i} = \vec{0}
\]
and so, by linear independence of the $\vec{w}_{i}$, it follows that each $c_{i} = 0$.

4.10.58 If $\vec{x},\vec{y} \in V \cap W$, then for scalars $\alpha,\beta$, the linear combination $\alpha\vec{x} + \beta\vec{y}$ must be in both $V$ and $W$ since they are both subspaces.


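4.10.60 Let $\{\vec{x}_{1},\cdots,\vec{x}_{k}\}$ be a basis for $V \cap W$. Then there is a basis for $V$ and $W$ which are respectively
\[
\{\vec{x}_{1},\cdots,\vec{x}_{k},\vec{y}_{k+1},\cdots,\vec{y}_{p}\},\quad \{\vec{x}_{1},\cdots,\vec{x}_{k},\vec{z}_{k+1},\cdots,\vec{z}_{q}\}
\]
It follows that you must have $k + p - k + q - k \leq n$ and so you must have
\[
p + q - n \leq k
\]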


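4.10.61 Here is how you do this. Consider $B(\mathbb{R}^{p}) \cap \ker(A)$ and let a basis be $\{\vec{w}_{1},\cdots,\vec{w}_{k}\}$. Then each $\vec{w}_{i}$ is of the form $B\vec{z}_{i} = \vec{w}_{i}$. Therefore, $\{\vec{z}_{1},\cdots,\vec{z}_{k}\}$ is linearly independent and $AB\vec{z}_{i} = \vec{0}$. Now let $\{\vec{u}_{1},\cdots,\vec{u}_{r}\}$ be a basis for $\ker(B)$. If $AB\vec{x} = \vec{0}$, then $B\vec{x} \in \ker(A) \cap B(\mathbb{R}^{p})$ and so $B\vec{x} = \sum_{i=1}^{k} c_{i}B\vec{z}_{i}$, which implies
\[
\vec{x} - \sum_{i=1}^{k} c_{i}\vec{z}_{i} \in \ker(B)
\]
and so it is of the form
\[
\vec{x} - \sum_{i=1}^{k} c_{i}\vec{z}_{i} = \sum_{j=1}^{r} d_{j}\vec{u}_{j}
\]
It follows that if $AB\vec{x} = \vec{0}$ so that $\vec{x} \in \ker(AB)$, then
\[
\vec{x} \in \operatorname{span}\left(\vec{z}_{1},\cdots,\vec{z}_{k},\vec{u}_{1},\cdots,\vec{u}_{r}\right).
\]
Therefore,
\[
\begin{aligned}
\dim(\ker(AB)) &\leq k + r = \dim\left(B(\mathbb{R}^{p}) \cap \ker(A)\right) + \dim(\ker(B)) \\
&\leq \dim(\ker(A)) + \dim(\ker(B))
\end{aligned}
\]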


4.10.62 If $\vec{x},\vec{y} \in \ker(A)$ then
\[
A(a\vec{x} + b\vec{y}) = aA\vec{x} + bA\vec{y} = a\vec{0} + b\vec{0} = \vec{0}
\]
and so $\ker(A)$ is closed under linear combinations. Hence it is a subspace.
4.11.6 (a) Orthogonal.

(b)  Symmetric. 

(c)  Skew  Symmetric. 

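4.11.7 $\|U\vec{x}\|^{2} = U\vec{x}\cdot U\vec{x} = U^{T}U\vec{x}\cdot\vec{x} = I\vec{x}\cdot\vec{x} = \|\vec{x}\|^{2}$.

Next suppose distance is preserved by $U$. Then
\[
\begin{aligned}
(U(\vec{x}+\vec{y}))\cdot(U(\vec{x}+\vec{y})) &= \|U\vec{x}\|^{2} + \|U\vec{y}\|^{2} + 2(U\vec{x}\cdot U\vec{y}) \\
&= \|\vec{x}\|^{2} + \|\vec{y}\|^{2} + 2\left(U^{T}U\vec{x}\cdot\vec{y}\right)
\end{aligned}
\]
But since $U$ preserves distances, it is also the case that
\[
(U(\vec{x}+\vec{y}))\cdot(U(\vec{x}+\vec{y})) = \|\vec{x}\|^{2} + \|\vec{y}\|^{2} + 2(\vec{x}\cdot\vec{y})
\]
Hence
\[
\vec{x}\cdot\vec{y} = U^{T}U\vec{x}\cdot\vec{y}
\]
and so
\[
\left(\left(U^{T}U - I\right)\vec{x}\right)\cdot\vec{y} = 0
\]
Since $\vec{y}$ is arbitrary, it follows that $U^{T}U - I = 0$. Thus $U$ is orthogonal.

4.11.8 You could observe that $\det\left(UU^{T}\right) = (\det(U))^{2} = 1$ so $\det(U) \neq 0$.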
This requires a


4.11.11  Try 


This  requires  that  c 


4.11.12  (a) 


(b) 


(c) 


4 

3^2' 


r  2  72  1/9 1 

3  2  6vz' 

r  2  72  1/2 1 

3  2  6V  z 

1 

Cd 

2  -72  1 

!  3  2  3^2 

3  2  3^2 

1 

1  co 

O 

— <lm 

1 

1 

0 

U)  I 

<J-* 

toll 

1  0  0 
0  1  0 
0  0  1 


_  _2_ 
75 

0 

J_ 

75 


d 
AV5 


r  1 

3 

2 

75 

c 

2 

3 

0 

d 

2 

3 

1 

75 

±V5_ 

T 


c2  +  g  cd+l  ^V5c-A' 

cd  +  g  d~  +  9  \/57  +  9 

-  T5  —  45  15  +9  1 


-5 

3\/5‘ 


'  l 

3 

2 

3 

2 

3 


_ 

'75 

0 

j_ 

75 


2 

375 

-5 

375 


_  _2_ 
75 

0 

j_ 

75 


2 

375 

-5 

375 


1  0  0 
0  1  0 
0  0  1 


r  3  1 

54 

5 

0 

9 

1  1 

O  E'i|U>L'i|4^ 

1 _ 1 

9 

'  0  ' 
0 

1 

- 1 

O 

9 

r 4 1 
5 

0 

9 

'  0  ' 
1 

L  -5  J 

3 

L  5  J 

_  0  _ 

- 1 

O 

9 

r  4 1 
5 

0 

9 

'  0  ' 
1 

1 

’■sHto 

1 

3 

L  5  J 

0 

r^i 

r  ^  1 

r  1 

i-/6 

9 

1 

7 

L  \>ft  J 

[  -1V3  J 

4.11.13  A  solution  is 


567 


4.11.14  Then  a  solution  is 


1 

>-»!'— ON  I'— 

Osl  Osl 

r  W 2V3  i 

T~ 

JV6 

9 

9 

0 

.  lgV2V3  . 

-^ViVyi 

333  ^37 


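4.11.15 The subspace is of the form
\[
\begin{bmatrix} x \\ y \\ 2x + 3y \end{bmatrix}
\]
and a basis is
\[
\begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ 3 \end{bmatrix}
\]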


Therefore,  an  orthonormal  basis  is 


r+si 

0 

9 

L  ivs  J 

-^y/5y/U 

^y/5y/U 

IV5VU 
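The orthonormal basis in 4.11.15 comes from applying the Gram-Schmidt process to a basis of the subspace. The sketch below assumes the starting basis is $(1,0,2)$ and $(0,1,3)$, which is what the form $(x, y, 2x+3y)$ gives for $(x,y) = (1,0)$ and $(0,1)$; the function names are illustrative. A different starting basis or ordering would give a different, equally valid orthonormal basis.

```python
import math


def dot(u, v):
    """Standard dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Return an orthonormal list spanning the same space as `vectors`.

    Assumes the input vectors are linearly independent.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            # Subtract the component of w along each earlier unit vector.
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis


# Assumed starting basis for the subspace {(x, y, 2x + 3y)}.
q1, q2 = gram_schmidt([[1, 0, 2], [0, 1, 3]])
print(q1, q2)
```

With this starting basis the result is $(1,0,2)/\sqrt{5}$ and $(-6,5,3)/\sqrt{70}$.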


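4.11.21
\[
\begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 5 \end{bmatrix}^{T}\begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 5 \end{bmatrix} = \begin{bmatrix} 14 & 23 \\ 23 & 38 \end{bmatrix},\quad
\begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 5 \end{bmatrix}^{T}\begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 17 \\ 28 \end{bmatrix}
\]
\[
\begin{bmatrix} 14 & 23 \\ 23 & 38 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 17 \\ 28 \end{bmatrix}
\]
Solution is:
\[
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \frac{2}{3} \\ \frac{1}{3} \end{bmatrix}
\]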
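The numbers 14, 23, 38 and 17, 28 that appear in 4.11.21 are what the normal equations $A^{T}A\vec{x} = A^{T}\vec{b}$ produce for a least squares problem. The sketch below assumes $A$ has rows $(1,2)$, $(2,3)$, $(3,5)$ and $\vec{b} = (1,2,4)$, which is consistent with those numbers, and solves the resulting $2\times 2$ system exactly by Cramer's rule.

```python
from fractions import Fraction

# Assumed data, consistent with the entries that appear in 4.11.21.
A = [[1, 2], [2, 3], [3, 5]]
b = [1, 2, 4]

# Form the normal equations (A^T A) x = A^T b.
AtA = [[sum(row[i] * row[j] for row in A) for j in range(2)] for i in range(2)]
Atb = [sum(row[i] * bk for row, bk in zip(A, b)) for i in range(2)]

# Solve the 2x2 system exactly with Cramer's rule.
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
x = Fraction(Atb[0] * AtA[1][1] - AtA[0][1] * Atb[1], det)
y = Fraction(AtA[0][0] * Atb[1] - AtA[1][0] * Atb[0], det)
print(AtA, Atb, x, y)
```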


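4.12.7 The velocity is the sum of two vectors. $50\vec{i} + \frac{300}{\sqrt{2}}\left(\vec{i} + \vec{j}\right) = \left(50 + \frac{300}{\sqrt{2}}\right)\vec{i} + \frac{300}{\sqrt{2}}\vec{j}$. The component in the direction of North is then $\frac{300}{\sqrt{2}} = 150\sqrt{2}$ and the velocity relative to the ground is
\[
\left(50 + \frac{300}{\sqrt{2}}\right)\vec{i} + \frac{300}{\sqrt{2}}\vec{j}
\]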


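4.12.10 Velocity of plane for the first hour: $\begin{bmatrix} 0 & 150 \end{bmatrix} + \begin{bmatrix} 40 & 0 \end{bmatrix} = \begin{bmatrix} 40 & 150 \end{bmatrix}$. After one hour it is at $(40, 150)$. Next the velocity of the plane is $150\begin{bmatrix} \frac{1}{2} & \frac{\sqrt{3}}{2} \end{bmatrix} + \begin{bmatrix} 40 & 0 \end{bmatrix}$ in miles per hour. After two hours it is then at
\[
(40, 150) + 150\begin{bmatrix} \tfrac{1}{2} & \tfrac{\sqrt{3}}{2} \end{bmatrix} + \begin{bmatrix} 40 & 0 \end{bmatrix} = \begin{bmatrix} 155 & 75\sqrt{3} + 150 \end{bmatrix} = \begin{bmatrix} 155.0 & 279.9 \end{bmatrix}
\]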
4.12.11 Wind: $\begin{bmatrix} 0 & 50 \end{bmatrix}$. Direction it needs to travel: $(3,5)$. Then you need $250\begin{bmatrix} a & b \end{bmatrix} + \begin{bmatrix} 0 & 50 \end{bmatrix}$ to have this direction where $\begin{bmatrix} a & b \end{bmatrix}$ is an appropriate unit vector. Thus you need
\[
a^{2} + b^{2} = 1
\]
\[
\frac{250b + 50}{250a} = \frac{5}{3}
\]
Thus $a = \frac{3}{5}$, $b = \frac{4}{5}$. The velocity of the plane relative to the ground is $\begin{bmatrix} 150 & 250 \end{bmatrix}$. The speed of the plane relative to the ground is given by
\[
\sqrt{(150)^{2} + (250)^{2}} = 291.55 \text{ miles per hour}
\]
It has to go a distance of $\sqrt{(300)^{2} + (500)^{2}} = 583.10$ miles. Therefore, it takes
\[
\frac{583.1}{291.55} = 2 \text{ hours}
\]
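The heading problem in 4.12.11 can be checked numerically. The sketch below takes a slightly different route from the text: it writes the ground velocity as $k(3,5)$, requires the air velocity (ground velocity minus wind) to have length 250, and solves the resulting quadratic for $k$. The function name and argument layout are illustrative assumptions.

```python
import math


def heading(airspeed, wind, direction):
    """Return (unit heading vector, ground velocity) for a required travel direction.

    Solves |k * direction - wind| = airspeed for the positive root k.
    """
    dx, dy = direction
    wx, wy = wind
    qa = dx * dx + dy * dy
    qb = -2 * (dx * wx + dy * wy)
    qc = wx * wx + wy * wy - airspeed ** 2
    k = (-qb + math.sqrt(qb * qb - 4 * qa * qc)) / (2 * qa)
    ground = (k * dx, k * dy)
    unit = ((ground[0] - wx) / airspeed, (ground[1] - wy) / airspeed)
    return unit, ground


unit, ground = heading(250.0, (0.0, 50.0), (3.0, 5.0))
hours = math.hypot(300.0, 500.0) / math.hypot(*ground)
print(unit, ground, hours)
```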


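4.12.12 Water: $\begin{bmatrix} -2 & 0 \end{bmatrix}$ Swimmer: $\begin{bmatrix} 0 & 3 \end{bmatrix}$ Speed relative to earth: $\begin{bmatrix} -2 & 3 \end{bmatrix}$. It takes him $1/6$ of an hour to get across. Therefore, he ends up travelling $\frac{1}{6}\sqrt{4 + 9} = \frac{1}{6}\sqrt{13}$ miles. He ends up $1/3$ mile downstream.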


4.12.13 Man: $3\begin{bmatrix} a & b \end{bmatrix}$ Water: $\begin{bmatrix} -2 & 0 \end{bmatrix}$ Then you need $3a = 2$ and so $a = 2/3$ and hence $b = \sqrt{5}/3$.
The  vector  is  then  |  . 

In  the  second  case,  he  could  not  do  it.  You  would  need  to  have  a  unit  vector  [a  b  ]  such  that  2  a  =  3 
which  is  not  possible. 


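4.12.17 $\operatorname{proj}_{\vec{D}}\left(\vec{F}\right) = \frac{\vec{F}\cdot\vec{D}}{\|\vec{D}\|}\,\frac{\vec{D}}{\|\vec{D}\|} = \left(\|\vec{F}\|\cos\theta\right)\frac{\vec{D}}{\|\vec{D}\|} = \left(\|\vec{F}\|\cos\theta\right)\vec{u}$

4.12.18 $40\cos\left(\frac{20}{180}\pi\right)100 = 3758.8$

4.12.19 $20\cos\left(\frac{\pi}{6}\right)200 = 3464.1$

4.12.20 $20\left(\cos\frac{\pi}{4}\right)300 = 4242.6$

4.12.21 $200\left(\cos\left(\frac{\pi}{6}\right)\right)20 = 3464.1$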
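The work computations in 4.12.18 to 4.12.21 all have the form force times $\cos$(angle) times distance. In the sketch below the angles ($20^\circ$, $\pi/6$, $\pi/4$, $\pi/6$) are assumptions inferred from the printed numerical answers, since the angle symbols did not survive in this copy.

```python
import math


def work(force, angle, distance):
    """Work done by a constant force at `angle` (radians) to the displacement."""
    return force * math.cos(angle) * distance


# (force, assumed angle, distance) for 4.12.18 through 4.12.21.
cases = [
    (40, math.radians(20), 100),
    (20, math.pi / 6, 200),
    (20, math.pi / 4, 300),
    (200, math.pi / 6, 20),
]
results = [work(f, a, d) for f, a, d in cases]
print([round(r, 1) for r in results])
```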

4.12.22 $\begin{bmatrix} -4 \\ 3 \\ -4 \end{bmatrix}\cdot\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\times 10 = 30$. You can consider the resultant of the two forces because of the properties of the dot product.
4.12.23


1 

1 

/  w  \ 

1 - 

F\  • 

10  +  F2» 

10  =  [Fi  +F2J  • 

o 

0 

0 

10 


6  ' 
4 

• 

"  1  " 

f 

-4 

50V2 

V2 

0 

10 


4.12.24 


2  ' 

0  ' 

3 

• 

1 

V2 

-4 

1 

L  V2  J 

20  =  —  lOy/2 


5.1.1  This  result  follows  from  the  properties  of  matrix  multiplication. 

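5.1.2
\[
\begin{aligned}
T_{\vec{u}}(a\vec{v} + b\vec{w}) &= a\vec{v} + b\vec{w} - \frac{(a\vec{v} + b\vec{w})\cdot\vec{u}}{\|\vec{u}\|^{2}}\vec{u} \\
&= a\vec{v} - a\frac{\vec{v}\cdot\vec{u}}{\|\vec{u}\|^{2}}\vec{u} + b\vec{w} - b\frac{\vec{w}\cdot\vec{u}}{\|\vec{u}\|^{2}}\vec{u} \\
&= aT_{\vec{u}}(\vec{v}) + bT_{\vec{u}}(\vec{w})
\end{aligned}
\]

5.1.3 Linear transformations take $\vec{0}$ to $\vec{0}$ which $T$ does not. Also $T_{\vec{a}}(\vec{u} + \vec{v}) \neq T_{\vec{a}}\vec{u} + T_{\vec{a}}\vec{v}$.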

5.2.1 (a) The matrix of $T$ is the elementary matrix which multiplies the $j^{th}$ diagonal entry of the identity matrix by $b$.

(b) The matrix of $T$ is the elementary matrix which takes $b$ times the $j^{th}$ row and adds to the $i^{th}$ row.

(c) The matrix of $T$ is the elementary matrix which switches the $i^{th}$ and the $j^{th}$ rows where the two components are in the $i^{th}$ and $j^{th}$ positions.


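5.2.2 Suppose
\[
\begin{bmatrix} \vec{c}_{1}^{T} \\ \vdots \\ \vec{c}_{n}^{T} \end{bmatrix} = \begin{bmatrix} \vec{a}_{1} & \cdots & \vec{a}_{n} \end{bmatrix}^{-1}
\]
Thus $\vec{c}_{i}^{T}\vec{a}_{j} = \delta_{ij}$. Therefore,
\[
\begin{aligned}
\begin{bmatrix} \vec{b}_{1} & \cdots & \vec{b}_{n} \end{bmatrix}\begin{bmatrix} \vec{a}_{1} & \cdots & \vec{a}_{n} \end{bmatrix}^{-1}\vec{a}_{i}
&= \begin{bmatrix} \vec{b}_{1} & \cdots & \vec{b}_{n} \end{bmatrix}\begin{bmatrix} \vec{c}_{1}^{T} \\ \vdots \\ \vec{c}_{n}^{T} \end{bmatrix}\vec{a}_{i} \\
&= \begin{bmatrix} \vec{b}_{1} & \cdots & \vec{b}_{n} \end{bmatrix}\vec{e}_{i} \\
&= \vec{b}_{i}
\end{aligned}
\]
Thus $T\vec{a}_{i} = \begin{bmatrix} \vec{b}_{1} & \cdots & \vec{b}_{n} \end{bmatrix}\begin{bmatrix} \vec{a}_{1} & \cdots & \vec{a}_{n} \end{bmatrix}^{-1}\vec{a}_{i} = A\vec{a}_{i}$. If $\vec{x}$ is arbitrary, then since the matrix $\begin{bmatrix} \vec{a}_{1} & \cdots & \vec{a}_{n} \end{bmatrix}$ is invertible, there exists a unique $\vec{y}$ such that $\begin{bmatrix} \vec{a}_{1} & \cdots & \vec{a}_{n} \end{bmatrix}\vec{y} = \vec{x}$. Hence
\[
T\vec{x} = T\left(\sum_{i=1}^{n} y_{i}\vec{a}_{i}\right) = \sum_{i=1}^{n} y_{i}T\vec{a}_{i} = \sum_{i=1}^{n} y_{i}A\vec{a}_{i} = A\left(\sum_{i=1}^{n} y_{i}\vec{a}_{i}\right) = A\vec{x}
\]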

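5.2.3
\[
\begin{bmatrix} 5 & 1 & 5 \\ 1 & 1 & 3 \\ 3 & 5 & -2 \end{bmatrix}\begin{bmatrix} 3 & 2 & 1 \\ 2 & 2 & 1 \\ 4 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 37 & 17 & 11 \\ 17 & 7 & 5 \\ 11 & 14 & 6 \end{bmatrix}
\]

5.2.4
\[
\begin{bmatrix} 1 & 2 & 6 \\ 3 & 4 & 1 \\ 1 & 1 & -1 \end{bmatrix}\begin{bmatrix} 6 & 3 & 1 \\ 5 & 3 & 1 \\ 6 & 2 & 1 \end{bmatrix} = \begin{bmatrix} 52 & 21 & 9 \\ 44 & 23 & 8 \\ 5 & 4 & 1 \end{bmatrix}
\]

5.2.5
\[
\begin{bmatrix} -3 & 1 & 5 \\ 1 & 3 & 3 \\ 3 & -3 & -3 \end{bmatrix}\begin{bmatrix} 2 & 2 & 1 \\ 1 & 2 & 1 \\ 4 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 15 & 1 & 3 \\ 17 & 11 & 7 \\ -9 & -3 & -3 \end{bmatrix}
\]

5.2.6
\[
\begin{bmatrix} 3 & 1 & 1 \\ 3 & 2 & 3 \\ 3 & 3 & -1 \end{bmatrix}\begin{bmatrix} 6 & 2 & 1 \\ 5 & 2 & 1 \\ 6 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 29 & 9 & 5 \\ 46 & 13 & 8 \\ 27 & 11 & 5 \end{bmatrix}
\]

5.2.7
\[
\begin{bmatrix} 5 & 3 & 2 \\ 2 & 3 & 5 \\ 5 & 5 & -2 \end{bmatrix}\begin{bmatrix} 11 & 4 & 1 \\ 10 & 4 & 1 \\ 12 & 3 & 1 \end{bmatrix} = \begin{bmatrix} 109 & 38 & 10 \\ 112 & 35 & 10 \\ 81 & 34 & 8 \end{bmatrix}
\]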
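Each of the answers 5.2.3 to 5.2.7 is a product of two $3\times 3$ matrices. The sketch below multiplies the pair whose entries appear under 5.2.3, read row by row, as a check on the arithmetic; `matmul` is an illustrative helper name.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Factors and product as they appear in 5.2.3.
left = [[5, 1, 5], [1, 1, 3], [3, 5, -2]]
right = [[3, 2, 1], [2, 2, 1], [4, 1, 1]]
product = matmul(left, right)
print(product)
```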

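5.2.11 Recall that $\operatorname{proj}_{\vec{u}}(\vec{v}) = \frac{\vec{v}\cdot\vec{u}}{\|\vec{u}\|^{2}}\vec{u}$ and so the desired matrix has $i^{th}$ column equal to $\operatorname{proj}_{\vec{u}}(\vec{e}_{i})$. Therefore, the matrix desired is

5.2.12
\[
\frac{1}{35}\begin{bmatrix} 1 & 5 & 3 \\ 5 & 25 & 15 \\ 3 & 15 & 9 \end{bmatrix}
\]

5.2.13
\[
\frac{1}{10}\begin{bmatrix} 1 & 0 & 3 \\ 0 & 0 & 0 \\ 3 & 0 & 9 \end{bmatrix}
\]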

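5.3.2 The matrix of $S \circ T$ is given by $BA$.
\[
\begin{bmatrix} 0 & -2 \\ 4 & 2 \end{bmatrix}\begin{bmatrix} 3 & 1 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} 2 & -4 \\ 10 & 8 \end{bmatrix}
\]
Now, $(S \circ T)(\vec{x}) = (BA)\vec{x}$.
\[
\begin{bmatrix} 2 & -4 \\ 10 & 8 \end{bmatrix}\begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 8 \\ 12 \end{bmatrix}
\]

5.3.3 To find $(S \circ T)(\vec{x})$ we compute $S(T(\vec{x}))$.
\[
\begin{bmatrix} 1 & 2 \\ -1 & 3 \end{bmatrix}\begin{bmatrix} 2 \\ -3 \end{bmatrix} = \begin{bmatrix} -4 \\ -11 \end{bmatrix}
\]

5.3.5 The matrix of $T^{-1}$ is $A^{-1}$.
\[
\begin{bmatrix} 2 & 1 \\ 5 & 2 \end{bmatrix}^{-1} = \begin{bmatrix} -2 & 1 \\ 5 & -2 \end{bmatrix}
\]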


5.4.1 

5.4.2 


cos(f)  -Sin(f)  ' 

[  i  4V3  1 

sin(f)  cos(f)  _ 

Uv/3  1  J 

cos(f)  -sin(f) 

-V2i 

sin  (f )  cos(f) 

[  bn  y*  \ 

5.4.3 

5.4.4 


cos 

sin 

cos 

sin 


(-!) 

—  sin  (— f )  ' 

r  l 

2 

^1 

H) 

cos(-f)  _ 

1-V3 

1 

2  J 

)  -sin(f)  ' 

r  l 

2 

4^1 

)  cos(¥)  . 

L  V3 

1 

2  - 

5.4.5 


(!) 

-sin(f)  ' 

cos(-f)  —sin  (—4) 

(!) 

cos(f)  _ 

_  sin(-f)  cos(-f) 

JV2V3  +  JV2  \V2-\V2V3 
\V2V3-\V2  \V2V3  +  \V2 


5.4.6 


'  1 

0  ' 

'  cos(f) 

-sin(f)  1 

r  1 

2 

-V3  1 

0 

-1 

.  sin(¥) 

cos(f)  . 

1 

2 
5.4.7


5.4.8 


5.4.9 


5.4.10 


5.4.11 


5.4.12 


5.4.13 


1 

ol 

'cos(f)  - 

-sin(f)  ' 

1 

2 

473 

0  -lj 

.  sin(f)  - 

cos(f)  _ 

■ 

-V3 

1 

2 

1 

ol 

cos(f)  - 

-sin(f)  ' 

r 

'  *72 

472 

0  -lj 

.sin  (f)  ■ 

cos(f)  _ 

- 

.472 

472 

'-10 

'  r  cos(f) 

-sin  (f)  ' 

1 

[473 

1  ] 

2 

0  1 

.  L  sin(f) 

cos(f) 

- 

1 

L  2 

i73  J 

cos(f) 

-sin(f)  ' 

0 

1 

r  472 

*72  ] 

.  sin(!) 

cos(f)  . 

[0  -1 

- 

472  - 

472  J 

cos(f) 

-sin  (f)  ' 

'-10' 

r 

472 

472 

sin(f) 

cos(f)  _ 

0  1 

- 

.  472 

^72 

'  cos  (§) 

—  sin  (§)  ' 

[1 

0 

-I 

473 

1  1 

2 

.  sin(!) 

cos(f)  _ 

[0  -1 

- 

1 

.  2 

473  J 

'  cos(f) 

-sin(f)  ' 

'-10 

-I 

[473 

1  “I 

2 

.  sin(f) 

cos(f)  _ 

0  1 

1 

2 

i73  J 

5.4.14 


cos 

(¥) 

-sin  (4  ' 

COS  1 

C-f) 

—  sin  (— f )  ' 

sin  ( 

'2  71  \ 

<  3  > 

cos(f)  _ 

sin  ( 

-I) 

cos(-f) 

'  \V2V3-\V2  -\V2V3-\V2 
_  iv/2v/3  +  ^v/2  \V2V3-\V2 

Note  that  it  doesn’t  matter  about  the  order  in  this  case. 


5.4.15 


'  1 

0 

0  ' 

"  cos(f) 

—  sin  (f ) 

0  ' 

473 

1 

2 

0 

0 

1 

0 

sin  (f ) 

cos(f) 

0 

= 

1 

2 

173 

0 

0 

0 

-1 

0 

0 

1 

0 

0 

-1  _ 

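5.4.16
\[
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
\begin{bmatrix} \cos(-\theta) & -\sin(-\theta) \\ \sin(-\theta) & \cos(-\theta) \end{bmatrix}
= \begin{bmatrix} \cos^{2}\theta - \sin^{2}\theta & 2\cos\theta\sin\theta \\ 2\cos\theta\sin\theta & \sin^{2}\theta - \cos^{2}\theta \end{bmatrix}
\]
Now to write in terms of $(a,b)$, note that $a/\sqrt{a^{2}+b^{2}} = \cos\theta$, $b/\sqrt{a^{2}+b^{2}} = \sin\theta$. Now plug this in to the above. The result is
\[
\begin{bmatrix} \frac{a^{2}-b^{2}}{a^{2}+b^{2}} & \frac{2ab}{a^{2}+b^{2}} \\ \frac{2ab}{a^{2}+b^{2}} & \frac{b^{2}-a^{2}}{a^{2}+b^{2}} \end{bmatrix}
= \frac{1}{a^{2}+b^{2}}\begin{bmatrix} a^{2}-b^{2} & 2ab \\ 2ab & b^{2}-a^{2} \end{bmatrix}
\]
Since this is a unit vector, $a^{2}+b^{2} = 1$ and so you get
\[
\begin{bmatrix} a^{2}-b^{2} & 2ab \\ 2ab & b^{2}-a^{2} \end{bmatrix}
\]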
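The matrix in 5.4.16 reflects across the line through the origin in the direction $(a,b)$. The sketch below builds the general form (with the $a^{2}+b^{2}$ denominator, so $(a,b)$ need not be a unit vector) and applies it to two vectors: one on the line, which should be fixed, and one perpendicular to it, which should be negated. The function names are illustrative.

```python
from fractions import Fraction


def reflection_matrix(a, b):
    """Matrix of the reflection across the line through the origin along (a, b)."""
    n = a * a + b * b
    return [
        [Fraction(a * a - b * b, n), Fraction(2 * a * b, n)],
        [Fraction(2 * a * b, n), Fraction(b * b - a * a, n)],
    ]


def apply(M, v):
    """Multiply a 2x2 matrix by a vector."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)


M = reflection_matrix(3, 4)
on_line = apply(M, (3, 4))      # a vector on the line stays where it is
off_line = apply(M, (-4, 3))    # a perpendicular vector is negated
print(M, on_line, off_line)
```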


5.5.5 


1  0 
0  1 
0  0 


5.5.6 This says that the columns of $A$ have a subset of $m$ vectors which are linearly independent. Therefore, this set of vectors is a basis for $\mathbb{R}^{m}$. It follows that the span of the columns is all of $\mathbb{R}^{m}$. Thus $A$ is onto.


5.5.7  The  columns  are  independent.  Therefore,  A  is  one  to  one. 


5.5.8 Saying the rank is $n$ is the same as saying the columns are independent, which is the same as saying $A$ is one to one, which is the same as saying the columns are a basis. Thus the span of the columns of $A$ is all of $\mathbb{R}^{n}$ and so $A$ is onto. If $A$ is onto, then the columns must be linearly independent since otherwise the span of these columns would have dimension less than $n$ and so the dimension of $\mathbb{R}^{n}$ would be less than $n$.


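5.6.1 If $\sum_{i}^{r} a_{i}\vec{v}_{i} = \vec{0}$, then using linearity properties of $T$ we get
\[
\vec{0} = T(\vec{0}) = T\left(\sum_{i}^{r} a_{i}\vec{v}_{i}\right) = \sum_{i}^{r} a_{i}T(\vec{v}_{i}).
\]
Since we assume that $\{T\vec{v}_{1},\cdots,T\vec{v}_{r}\}$ is linearly independent, we must have all $a_{i} = 0$, and therefore we conclude that $\{\vec{v}_{1},\cdots,\vec{v}_{r}\}$ is also linearly independent.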

5.6.3 Since the third vector is a linear combination of the first two, the image of the third vector will also be a linear combination of the images of the first two. However, the images of the first two vectors are linearly independent (check!), and hence form a basis of the image.

Thus  a  basis  for  im(T)  is: 


5.7.1  In  this  case  dim(W)  =  1  and  a  basis  for  W  consisting  of  vectors  in  S  can  be  obtained  by  taking  any 
(nonzero)  vector  from  S. 
5.7.2 A basis for $\ker(T)$ is
\[
\left\{ \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right\}
\]
and a basis for $\operatorname{im}(T)$ is


There are many other possibilities for the specific bases, but in this case $\dim(\ker(T)) = 1$ and $\dim(\operatorname{im}(T)) = 1$.

5.7.3 In this case $\ker(T) = \{\vec{0}\}$ and $\operatorname{im}(T) = \mathbb{R}^{2}$ (pick any basis of $\mathbb{R}^{2}$).

5.7.4  There  are  many  possible  such  extensions,  one  is  (how  do  we  know?): 


r 

■  i  ■ 

"  -1  ' 

'  0  ' 

1 

) 

i 

9 

2 

9 

0 

{ 

i 

i 

-1 

1 

I 

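5.7.5 We can easily see that $\dim(\operatorname{im}(T)) = 1$, and thus $\dim(\ker(T)) = 3 - \dim(\operatorname{im}(T)) = 3 - 1 = 2$.

5.8.2 $C_{B}(\vec{x}) = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}$

5.8.3 $M_{B_{2}B_{1}} = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}$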


'  -3 1  ' 

'  -3  ' 

5.9.1  Solution  is: 

-t 

,  ?3  e  M  .  A  basis  for  the  solution  space  is 

-1 

t 

1 

f 

-3%  ' 

0  ' 

5.9.2  Note  that  this  has  the  same  matrix  as  the  above  problem.  Solution  is: 

~h 

+ 

-1 

h  _ 

0 

,t3  e 


5.9.3  Solution  is: 


3 1  ' 

'  3  ' 

It 

,  A  basis  is 

2 

t 

1 

5.9.4  Solution  is: 


3 1 
2 1 
t 


+ 


-3 

-1 

0 


JeR 


5.9.5  Solution  is: 


A  basis  is 


-4 

-2 

1 


1 

1 

0  ' 

5.9.6  Solution  is: 

—2? 

t 

+ 

-1 

0 

5.9.7  Solution  is: 


,teR. 


-f 

It 

t 


"  -f  " 

'  -1  ' 

5.9.8  Solutionis: 

It 

+ 

-1 

t 

0 

5.9.9  Solution  is: 


0 

-f 

-f 

? 


,teR 


5.9.10  Solution  is 


0 

r  2 

-f 

-1 

+ 

-1 

-t 

t 

0 

5.9.11  Solution  is: 


— s  —  t 


s 

s 

t 


,s,/ef.  A  basis  is 


5.9.12  Solution  is: 


5.9.13  Solutionis: 


for  s,t  e  R.  A  basis  is 
5.9.14 Solution is:


r  3  i 
1 

—  is  —  It 

1  1 

2s -y 

2 

+ 

0 

s 

0  _ 

t 

5.9.15  Solutionis: 


5.9.16  Solution  is: 


-f 

t 

i 

0 

-t 

i 

i 

0 


,  a  basis  is 


1 

1 

1 

0 


+ 


-9 

5 

0 

6 


,t  e  R. 


5.9.17 If not, then there would be infinitely many solutions to $A\vec{x} = \vec{0}$ and each of these added to a solution to $A\vec{x} = \vec{b}$ would be a solution to $A\vec{x} = \vec{b}$.


6.1.1 (a) $z + w = 5 - i$

(b) $z - 2w = -4 + 23i$

(c) $zw = 62 + 5i$

(d) $\frac{w}{z} = -\frac{50}{53} - \frac{37}{53}i$
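The four answers in 6.1.1 can be checked with Python's built-in complex type. The values $z = 2 + 7i$ and $w = 3 - 8i$ used below are an assumption: they are the unique pair consistent with the printed answers to parts (a) and (b), and they also reproduce part (c).

```python
# Assumed inputs, recovered from the answers to parts (a) and (b).
z = complex(2, 7)
w = complex(3, -8)

total = z + w            # part (a)
difference = z - 2 * w   # part (b)
product = z * w          # part (c)
quotient = w / z         # part (d)
print(total, difference, product, quotient)
```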


6.1.4  If  z  =  0,  let  w  =  1.  If  z  7^  0,  let  w  = 

\z\ 

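6.1.5
\[
\overline{(a + bi)(c + di)} = \overline{ac - bd + (ad + bc)i} = (ac - bd) - (ad + bc)i
\]
\[
(a - bi)(c - di) = ac - bd - (ad + bc)i
\]
which is the same thing. Thus it holds for a product of two complex numbers. Now suppose you have that it is true for the product of $n$ complex numbers. Then
\[
\overline{z_{1}\cdots z_{n+1}} = \overline{z_{1}\cdots z_{n}}\;\overline{z_{n+1}}
\]
and now, by induction this equals
\[
\overline{z_{1}}\cdots\overline{z_{n}}\;\overline{z_{n+1}}
\]
As to sums, this is even easier.
\[
\overline{\sum_{j=1}^{n}\left(x_{j} + iy_{j}\right)} = \overline{\sum_{j=1}^{n}x_{j} + i\sum_{j=1}^{n}y_{j}}
= \sum_{j=1}^{n}x_{j} - i\sum_{j=1}^{n}y_{j} = \sum_{j=1}^{n}\left(x_{j} - iy_{j}\right) = \sum_{j=1}^{n}\overline{\left(x_{j} + iy_{j}\right)}.
\]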
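6.1.6 If $p(z) = 0$, then you have
\[
\begin{aligned}
\overline{p(z)} = 0 &= \overline{a_{n}z^{n} + a_{n-1}z^{n-1} + \cdots + a_{1}z + a_{0}} \\
&= \overline{a_{n}z^{n}} + \overline{a_{n-1}z^{n-1}} + \cdots + \overline{a_{1}z} + \overline{a_{0}} \\
&= \overline{a_{n}}\,\overline{z}^{n} + \overline{a_{n-1}}\,\overline{z}^{n-1} + \cdots + \overline{a_{1}}\,\overline{z} + \overline{a_{0}} \\
&= a_{n}\overline{z}^{n} + a_{n-1}\overline{z}^{n-1} + \cdots + a_{1}\overline{z} + a_{0} \\
&= p(\overline{z})
\end{aligned}
\]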


6.1.7  The  problem  is  that  there  is  no  single 

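6.2.5 You have $z = |z|(\cos\theta + i\sin\theta)$ and $w = |w|(\cos\phi + i\sin\phi)$. Then when you multiply these, you get
\[
\begin{aligned}
&|z|\,|w|(\cos\theta + i\sin\theta)(\cos\phi + i\sin\phi) \\
&= |z|\,|w|\left(\cos\theta\cos\phi - \sin\theta\sin\phi + i(\cos\theta\sin\phi + \cos\phi\sin\theta)\right) \\
&= |z|\,|w|\left(\cos(\theta + \phi) + i\sin(\theta + \phi)\right)
\end{aligned}
\]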

6.3.1 Solution is:
\[
(1-i)\sqrt{2},\ -(1+i)\sqrt{2},\ -(1-i)\sqrt{2},\ (1+i)\sqrt{2}
\]

6.3.2 The cube roots are the solutions to $z^{3} + 8 = 0$. Solution is: $i\sqrt{3} + 1,\ 1 - i\sqrt{3},\ -2$

6.3.3 The fourth roots are the solutions to $z^{4} + 16 = 0$. Solution is:
\[
(1-i)\sqrt{2},\ -(1+i)\sqrt{2},\ -(1-i)\sqrt{2},\ (1+i)\sqrt{2}
\]
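The roots listed in 6.3.2 and 6.3.3 follow from the polar form: the $n$ roots of $c$ have modulus $|c|^{1/n}$ and arguments $(\theta + 2\pi k)/n$. The sketch below computes them numerically with the standard `cmath` module; `nth_roots` is an illustrative helper name.

```python
import cmath
import math


def nth_roots(c, n):
    """All n complex n-th roots of c, computed from its polar form."""
    r, theta = cmath.polar(c)
    return [cmath.rect(r ** (1 / n), (theta + 2 * math.pi * k) / n) for k in range(n)]


fourth_roots = nth_roots(-16, 4)   # solutions of z^4 + 16 = 0
cube_roots = nth_roots(-8, 3)      # solutions of z^3 + 8 = 0
print(fourth_roots)
print(cube_roots)
```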


6.3.4 Yes, it holds for all integers. First of all, it clearly holds if $n = 0$. Suppose now that $n$ is a negative integer. Then $-n > 0$ and so
\[
\begin{aligned}
[r(\cos t + i\sin t)]^{n} &= \frac{1}{[r(\cos t + i\sin t)]^{-n}} = \frac{1}{r^{-n}(\cos(-nt) + i\sin(-nt))} \\
&= \frac{r^{n}}{\cos(nt) - i\sin(nt)} = \frac{r^{n}(\cos(nt) + i\sin(nt))}{(\cos(nt) - i\sin(nt))(\cos(nt) + i\sin(nt))} \\
&= r^{n}(\cos(nt) + i\sin(nt))
\end{aligned}
\]
because $(\cos(nt) - i\sin(nt))(\cos(nt) + i\sin(nt)) = 1$.

6.3.5 Solution is: $i\sqrt{3} + 1,\ 1 - i\sqrt{3},\ -2$ and so this polynomial equals
\[
(x + 2)\left(x - \left(i\sqrt{3} + 1\right)\right)\left(x - \left(1 - i\sqrt{3}\right)\right)
\]


6.3.6  x3  +  27  =  (x  +  3)  (x2  —  3x  +  9) 
6.3.7 Solution is:
\[
(1-i)\sqrt{2},\ -(1+i)\sqrt{2},\ -(1-i)\sqrt{2},\ (1+i)\sqrt{2}.
\]
These are just the fourth roots of $-16$. Then to factor, you get
\[
\left(x - \left((1-i)\sqrt{2}\right)\right)\left(x - \left(-(1+i)\sqrt{2}\right)\right)\cdot\left(x - \left(-(1-i)\sqrt{2}\right)\right)\left(x - \left((1+i)\sqrt{2}\right)\right)
\]


6.3.8 $x^{4} + 16 = \left(x^{2} - 2\sqrt{2}x + 4\right)\left(x^{2} + 2\sqrt{2}x + 4\right)$. You can use the information in the preceding problem. Note that $(x - z)(x - \overline{z})$ has real coefficients.

6.3.9 Yes, this is true.
\[
\begin{aligned}
(\cos\theta - i\sin\theta)^{n} &= (\cos(-\theta) + i\sin(-\theta))^{n} \\
&= \cos(-n\theta) + i\sin(-n\theta) \\
&= \cos(n\theta) - i\sin(n\theta)
\end{aligned}
\]

6.3.10 $p(x) = (x - z_{1})q(x) + r(x)$ where $r(x)$ is a nonzero constant or equal to 0. However, $r(z_{1}) = 0$ and so $r(x) = 0$. Now do to $q(x)$ what was done to $p(x)$ and continue until the degree of the resulting $q(x)$ equals 0. Then you have the above factorization.


6.4.1
\[
(x - (1 + i))(x - (2 + i)) = x^{2} - (3 + 2i)x + 1 + 3i
\]


6.4.2 (a) Solution is: $1 + i,\ 1 - i$

(b)  Solutionis:  g/v/35  —  g,  — gzV 35  — g 

(c) Solution is: $3 + 2i,\ 3 - 2i$

(d)  Solution  is:  zy^  —  2,  — z'v^  — 2 

(e)  Solution  is:  —  \  +  z,  —  \  —  i 


6.4.3  (a)  Solution  is  :  x  =  — 1  +  jV2  —  x  =  —  T—  ^a/2+ 

(b)  Solution  is  :  x  =  1  —  |z,  x  =  —  1  —  \i 

(c)  Solution  is  :  x  =  —  x  =  —  \  —  i 

(d)  Solution  is  :  x  =  —  1  +  2z,  x  =  1  +  2z 

(e)  Solution  is  :x=-g  +  g\/l9+(g-gv/19)  6  *  = -g  -  gA/l9+ (g  +  gVl9 )  z 

7.1.1 $A^{m}X = \lambda^{m}X$ for any integer. In the case of $-1$, $A^{-1}\lambda X = AA^{-1}X = X$ so $A^{-1}X = \lambda^{-1}X$. Thus the eigenvalues of $A^{-1}$ are just $\lambda^{-1}$ where $\lambda$ is an eigenvalue of $A$.

7.1.2 Say $AX = \lambda X$. Then $cAX = c\lambda X$ and so the eigenvalues of $cA$ are just $c\lambda$ where $\lambda$ is an eigenvalue of $A$.

7.1.3 $BAX = ABX = A\lambda X = \lambda AX$. Here it is assumed that $BX = \lambda X$.

7.1.4 Let $X$ be the eigenvector. Then $A^{m}X = \lambda^{m}X$, $A^{m}X = AX = \lambda X$ and so
\[
\lambda^{m} = \lambda
\]
Hence if $\lambda \neq 0$, then
\[
\lambda^{m-1} = 1
\]
and so $|\lambda| = 1$.


7.1.5  The  formula  follows  from  properties  of  matrix  multiplications.  However,  this  vector  might  not  be 
an  eigenvector  because  it  might  equal  0  and  eigenvectors  cannot  equal  0. 


7.1.14 Yes. $\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ works.

7.1.16 When you think of this geometrically, it is clear that the only two values of $\theta$ are 0 and $\pi$ or these added to integer multiples of $2\pi$.


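7.1.17 The matrix of $T$ is $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$. The eigenvectors and eigenvalues are:
\[
\left\{ \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\} \leftrightarrow -1,\quad \left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right\} \leftrightarrow 1
\]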


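7.1.18 The matrix of $T$ is $\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$. The eigenvectors and eigenvalues are:
\[
\left\{ \begin{bmatrix} -i \\ 1 \end{bmatrix} \right\} \leftrightarrow -i,\quad \left\{ \begin{bmatrix} i \\ 1 \end{bmatrix} \right\} \leftrightarrow i
\]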

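7.1.19 The matrix of $T$ is $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}$. The eigenvectors and eigenvalues are:
\[
\left\{ \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\} \leftrightarrow -1,\quad \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right\} \leftrightarrow 1
\]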


7.2.1 The eigenvalues are $-1, -1, 1$. The eigenvectors corresponding to the eigenvalues are:


10 

-2 

3 


++  -1, 


1 


Therefore  this  matrix  is  not  diagonalizable. 
7.2.2 The eigenvectors and eigenvalues are:


The matrix $P$ needed to diagonalize the above matrix is
\[
\begin{bmatrix} 2 & -2 & 7 \\ 0 & 1 & -2 \\ 1 & 0 & 2 \end{bmatrix}
\]
and the diagonal matrix $D$ is
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 3 \end{bmatrix}
\]

7.2.3 The eigenvectors and eigenvalues are:


The matrix $P$ needed to diagonalize the above matrix is
\[
\begin{bmatrix} -6 & -5 & -8 \\ -1 & -2 & -2 \\ 2 & 2 & 3 \end{bmatrix}
\]
and the diagonal matrix $D$ is
\[
\begin{bmatrix} 6 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & -2 \end{bmatrix}
\]

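7.2.8 The eigenvalues are distinct because they are the $n^{th}$ roots of 1. Hence if $X$ is a given vector with
\[
X = \sum_{j=1}^{n} a_{j}V_{j}
\]
then
\[
A^{nm}X = A^{nm}\sum_{j=1}^{n} a_{j}V_{j} = \sum_{j=1}^{n} a_{j}A^{nm}V_{j} = \sum_{j=1}^{n} a_{j}V_{j} = X
\]
so $A^{nm} = I$.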
7=1  7=1  7=1 


7.2.13 $AX = (a + ib)X$. Now take conjugates of both sides. Since $A$ is real,
\[
A\overline{X} = (a - ib)\overline{X}
\]
7.3.1 First we write $A = PDP^{-1}$.


11  -10 
11  0  3 


Therefore $A^{10} = PD^{10}P^{-1}$.


(-1)10  0 
0  310 


29525  29524 
29524  29525 
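The identity $A^{10} = PD^{10}P^{-1}$ in 7.3.1 can be checked against direct repeated multiplication. The factors did not survive legibly in this copy, so the sketch below assumes $A$ is the symmetric matrix with rows $(1,2)$ and $(2,1)$, which has eigenvalues $-1$ and $3$ and reproduces the printed entries 29525 and 29524. The particular $P$ used here is one valid choice of eigenvector matrix, not necessarily the one in the text.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def matpow(A, n):
    """A raised to the non-negative integer power n by repeated multiplication."""
    size = len(A)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, A)
    return result


# Assumed matrix (hypothetical reconstruction) with eigenvalues -1 and 3.
A = [[1, 2], [2, 1]]
P = [[1, 1], [-1, 1]]    # columns are eigenvectors for -1 and 3
P_inv = [[Fraction(1, 2), Fraction(-1, 2)], [Fraction(1, 2), Fraction(1, 2)]]
D10 = [[(-1) ** 10, 0], [0, 3 ** 10]]

A10_diag = matmul(matmul(P, D10), P_inv)
A10_direct = matpow(A, 10)
print(A10_direct)
```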


7.3.4 (a) Multiply the given matrix by the initial state vector given by $\begin{bmatrix} 90 \\ 81 \\ 85 \end{bmatrix}$. After one time period there are 89 people in location 1, 106 in location 2, and 61 in location 3.

(b) Solve the system given by $(I - A)X_{s} = 0$ where $A$ is the migration matrix and $X_{s} = \begin{bmatrix} x_{1s} \\ x_{2s} \\ x_{3s} \end{bmatrix}$ is the steady state vector. The solution to this system is given by
\[
x_{1s} = \frac{8}{5}x_{3s}
\]
\[
x_{2s} = \frac{63}{25}x_{3s}
\]
Letting $x_{3s} = t$ and using the fact that there are a total of 256 individuals, we must solve
\[
\frac{8}{5}t + \frac{63}{25}t + t = 256
\]
We find that $t = 50$. Therefore after a long time, there are 80 people in location 1, 126 in location 2, and 50 in location 3.


7.3.6 We solve $(I - A)X_{s} = 0$ to find the steady state vector $X_{s} = \begin{bmatrix} x_{1s} \\ x_{2s} \\ x_{3s} \end{bmatrix}$. The solution to the system is given by
\[
x_{1s} = \frac{5}{6}x_{3s}
\]
\[
x_{2s} = \frac{2}{3}x_{3s}
\]
Letting $x_{3s} = t$ and using the fact that there are a total of 480 individuals, we must solve
\[
\frac{5}{6}t + \frac{2}{3}t + t = 480
\]
We find that $t = 192$. Therefore after a long time, there are 160 people in location 1, 128 in location 2, and 192 in location 3.
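The last step of 7.3.4(b) and 7.3.6 is the same each time: once every steady state component is expressed as a multiple of the free parameter $t$, the known total population fixes $t$. The sketch below carries out that step with exact fractions. The ratios passed in are the ones implied by the final populations stated in the text (for example $80 : 126 : 50$); `steady_state` is an illustrative name.

```python
from fractions import Fraction


def steady_state(ratios, total):
    """Scale a vector of ratios (last entry is the free parameter's own
    coefficient, typically 1) so that its entries sum to `total`."""
    t = Fraction(total) / sum(ratios)
    return [r * t for r in ratios]


# 7.3.4(b): x1 = (8/5) t, x2 = (63/25) t, x3 = t, with 256 individuals.
pop_734 = steady_state([Fraction(8, 5), Fraction(63, 25), Fraction(1)], 256)

# 7.3.6: x1 = (5/6) t, x2 = (2/3) t, x3 = t, with 480 individuals.
pop_736 = steady_state([Fraction(5, 6), Fraction(2, 3), Fraction(1)], 480)
print(pop_734, pop_736)
```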


7.3.9
\[
X_{3} = \begin{bmatrix} 0.38 \\ 0.18 \\ 0.44 \end{bmatrix}
\]
Therefore the probability of ending up back in location 2 is 0.18.

7.3.10
\[
X_{2} = \begin{bmatrix} 0.367 \\ 0.4625 \\ 0.1705 \end{bmatrix}
\]
Therefore the probability of ending up in location 1 is 0.367.


7.3.11  The  migration  matrix  is 


A  = 


to 

j_ 

10 

J_ 

10 

J_ 

10 


J_ 

10 

I 

5 

3 

5 

J_ 

10 


1 

5 

J_ 

10 

1 

5 

1 

2 


To find the number of trailers in each location after a long time we solve the system $(I - A)X_{s} = 0$ for the steady state vector $X_{s} = \begin{bmatrix} x_{1s} \\ x_{2s} \\ x_{3s} \\ x_{4s} \end{bmatrix}$. The solution to the system is
\[
x_{1s} = \frac{9}{10}x_{4s}
\]
\[
x_{2s} = \frac{12}{5}x_{4s}
\]
\[
x_{3s} = \frac{8}{5}x_{4s}
\]
Letting $x_{4s} = t$ and using the fact that there are a total of 413 trailers we must solve
\[
\frac{9}{10}t + \frac{12}{5}t + \frac{8}{5}t + t = 413
\]
We find that $t = 70$. Therefore after a long time, there are 63 trailers in the SE, 168 in the NE, 112 in the NW and 70 in the SW.
7.3.19 The solution is


eAtC  = 


7.4.1  The  eigenvectors  and  eigenvalues  are: 


V3  | 


^6'  V5 


Se2t  -  6e3t 
lSe3t-l6e2 


o  18 


7.4.2  The  eigenvectors  and  eigenvalues  are: 


V2  „  ’V3  , 


7.4.3  The  eigenvectors  and  eigenvalues  are: 


-gVU 

4/2/3 


+*  -2 


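\[
\begin{bmatrix} \sqrt{3}/3 & -\sqrt{2}/2 & -\sqrt{6}/6 \\ \sqrt{3}/3 & \sqrt{2}/2 & -\sqrt{6}/6 \\ \sqrt{3}/3 & 0 & \sqrt{6}/3 \end{bmatrix}^{T}
\begin{bmatrix} -1 & 1 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{bmatrix}
\begin{bmatrix} \sqrt{3}/3 & -\sqrt{2}/2 & -\sqrt{6}/6 \\ \sqrt{3}/3 & \sqrt{2}/2 & -\sqrt{6}/6 \\ \sqrt{3}/3 & 0 & \sqrt{6}/3 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & -2 \end{bmatrix}
\]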

7.4.4  The  eigenvectors  and  eigenvalues  are: 


4/2/ 3 


The  matrix  U  has  these  as  its  columns. 

7.4.5  The  eigenvectors  and  eigenvalues  are: 

{mui 

I  . 4/2/3 .  1 


-4/2 

4/2 

0 


++  18, 


++  12, 


-4v/2 

4/2 

0 


0  24 


<->  18. 


The  matrix  U  has  these  as  its  columns. 
7.4.6 eigenvectors:


kV5V6 


These  vectors  are  the  columns  of  U. 


-kV2V3 

^y/2y/l5 


o  —2, 


§v/5 

3q\/30 


o  -3 


7.4.7  The  eigenvectors  and  eigenvalues  are:  <  —\\H- 

U  ^ 

These  vectors  are  the  columns  of  the  matrix  U. 


7.4.8  The  eigenvectors  and  eigenvalues  are: 


These  vectors  are  the  columns  of  U. 


7.4.9  The  eigenvectors  and  eigenvalues  are: 


^y/2y/5 

^V5 


The  columns  are  these  vectors. 


i  r  ^V2V5 

0  ,  ±V3V5 

W2V 3  -W5 


7.4.10  The  eigenvectors  and  eigenvalues  are: 


-iV3 


kV2V3 


-W2 


The  columns  are  these  vectors. 


7.4.11  eigenvectors: 


-JV3 

\V2 


0 


0-2, 


Then  the  columns  of  U  are  these  vectors 
7.4.12 The eigenvectors and eigenvalues are:


f 

r  -tvs  i 

) 

l 

0 

.  IV5V6  . 

9 

r  i 


L  15 


i^5 

4WI5 


-1, 


2q"\/30 


e2. 


The  columns  of  U  are  these  vectors. 


L  6 


-IV6  $y/2y/3  JV6 

0  ±V5  -|V5 

V5V6  j^y/2y/l5 


30 


_  1 

2 


iov/5 


-W6V5  ±V5 


7 

5 

■iVS 


■iV6 


9_ 

10 


r  -jvs 

I 

'  -1 

0 

0  ' 

0 

1  , 
Vl|fO 

Si 

= 

0 

-1 

0 

^V30. 

0 

0 

2 

7.4.13 If $A$ is given by the formula, then
\[
A^{T} = U^{T}D^{T}U = U^{T}DU = A
\]
Next suppose $A = A^{T}$. Then by the theorems on symmetric matrices, there exists an orthogonal matrix $U$ such that
\[
UAU^{T} = D
\]
for $D$ diagonal. Hence
\[
A = U^{T}DU
\]

7.4.14 Since $\lambda \neq \mu$, it follows $X \cdot Y = 0$.


7.4.21  Using  the  QR  factorization,  we  have: 


'  1  2  1' 

r  JV6  ^V3  1 

V6  W6  g\/6 

2-10 

— |x/2  -^/3 

0  ly/2  ^V2 

1  3  0 

.  $V6  IV2  -\V3  . 

o 

i 

o 

s 

A  solution  is  then 


r|^i 

f  h)v/2  1 

r  1 

9 

-\sfl 

9 

-tW3 

V  \>n  \ 

L  -1V3  J 

'  1 

2 

1  ' 

r^> 

JV5V3 

ifrv^v^ 

iiT^ii  1 

2 

-1 

0 

-|V2^3 

2^2  \/3v/37 

-m^TiT 

1 

3 

0 

JV6 

^v/2^3 

—  333'^'*^ 

37  n/  1 1 1 

0 

1 

1 

0 

JV2V3 

333\/3\/37 

-ITT '/ITT  J 

7.4.22 
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\/6  \V6 

0  IV2V3  ^V2V3 

0  ’  0  ^\/3\/37 

0  0  0 

Then  a  solution  is  _ 

'  1  r  IV2V3  1  r  ifTv/3737  - 

\y/6  -IV2V3 
\V6  ’  ^V2V3  ’ 

0  J  L  IV2V3  \  [ 

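7.4.25
\[
\begin{bmatrix} x & y & z \end{bmatrix}
\begin{bmatrix} a_{1} & a_{4}/2 & a_{5}/2 \\ a_{4}/2 & a_{2} & a_{6}/2 \\ a_{5}/2 & a_{6}/2 & a_{3} \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]

7.4.26 The quadratic form may be written as
\[
\vec{x}^{T}A\vec{x}
\]
where $A = A^{T}$. By the theorem about diagonalizing a symmetric matrix, there exists an orthogonal matrix $U$ such that
\[
U^{T}AU = D,\quad A = UDU^{T}
\]
Then the quadratic form is
\[
\vec{x}^{T}UDU^{T}\vec{x} = \left(U^{T}\vec{x}\right)^{T}D\left(U^{T}\vec{x}\right)
\]
where $D$ is a diagonal matrix having the real eigenvalues of $A$ down the main diagonal. Now simply let
\[
\vec{x}' = U^{T}\vec{x}
\]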

9.1.20 The axioms of a vector space all hold because they hold for a vector space. The only things left to verify are the assertions about the things which are supposed to exist. $0$ would be the zero function which sends everything to 0. This is an additive identity. Now if $f$ is a function, $(-f)(x) = -(f(x))$. Then

\[ (f + (-f))(x) = f(x) + (-f)(x) = f(x) + (-f(x)) = 0 \]

Hence $f + (-f) = 0$. For each $x \in [a,b]$, let $f_x(x) = 1$ and $f_x(y) = 0$ if $y \neq x$. Then these vectors are obviously linearly independent.

9.1.21 Let $f(i)$ be the $i$th component of a vector $\vec{x} \in \mathbb{R}^n$. Thus a typical element in $\mathbb{R}^n$ is $(f(1), \cdots, f(n))$.

9.1.22  This  is  just  a  subspace  of  the  vector  space  of  functions  because  it  is  closed  with  respect  to  vector 
addition  and  scalar  multiplication.  Hence  this  is  a  vector  space. 

9.2.1 $\sum_{k=1}^{n} 0 \vec{x}_k = \vec{0}$

9.3.29  Yes.  If  not,  there  would  exist  a  vector  not  in  the  span.  But  then  you  could  add  in  this  vector  and 
obtain  a  linearly  independent  set  of  vectors  with  more  vectors  than  a  basis. 
9.3.30 No. They can't be.

9.3.31  (a) 

(b) Suppose

\[ c_1 (x^3 + 1) + c_2 (x^2 + x) + c_3 (2x^3 + x^2) + c_4 (2x^3 - x^2 - 3x + 1) = 0 \]

Then combine the terms according to power of $x$.

\[ (c_1 + 2c_3 + 2c_4) x^3 + (c_2 + c_3 - c_4) x^2 + (c_2 - 3c_4) x + (c_1 + c_4) = 0 \]

Is there a nonzero solution to the system

\[
\begin{aligned}
c_1 + 2c_3 + 2c_4 &= 0 \\
c_2 + c_3 - c_4 &= 0 \\
c_2 - 3c_4 &= 0 \\
c_1 + c_4 &= 0
\end{aligned}
\]

Solution is:

\[ [c_1 = 0, c_2 = 0, c_3 = 0, c_4 = 0] \]

Therefore,  these  are  linearly  independent. 
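The conclusion of 9.3.31(b) can be checked mechanically: the four polynomials are independent exactly when the coefficient matrix of the system has rank 4. The sketch below is one way to do this with exact rational arithmetic; the function name `rank` and the row ordering (powers $x^3, x^2, x, 1$) are choices made here for illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank_count = 0
    for col in range(n_cols):
        # Find a row at or below the current pivot row with a nonzero entry.
        pivot = next((r for r in range(rank_count, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank_count], m[pivot] = m[pivot], m[rank_count]
        for r in range(rank_count + 1, n_rows):
            factor = m[r][col] / m[rank_count][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank_count])]
        rank_count += 1
    return rank_count

# Rows: coefficients of x^3, x^2, x, 1. Columns: c1, c2, c3, c4.
system = [
    [1, 0, 2, 2],
    [0, 1, 1, -1],
    [0, 1, 0, -3],
    [1, 0, 0, 1],
]
independent = rank(system) == 4
```

Full rank means the homogeneous system has only the trivial solution, which agrees with the hand computation.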

9.3.32 Let $p_i(x)$ denote the $i$th of these polynomials. Suppose $\sum_i C_i p_i(x) = 0$. Then collecting terms according to the exponent of $x$, you need to have

\[
\begin{aligned}
C_1 a_1 + C_2 a_2 + C_3 a_3 + C_4 a_4 &= 0 \\
C_1 b_1 + C_2 b_2 + C_3 b_3 + C_4 b_4 &= 0 \\
C_1 c_1 + C_2 c_2 + C_3 c_3 + C_4 c_4 &= 0 \\
C_1 d_1 + C_2 d_2 + C_3 d_3 + C_4 d_4 &= 0
\end{aligned}
\]

The matrix of coefficients is just the transpose of the above matrix. There exists a nontrivial solution if and only if the determinant of this matrix equals 0.


9.3.33 When you add two of these you get one and when you multiply one of these by a scalar, you get another one. A basis is $\{1, \sqrt{2}\}$. By definition, the span of these gives the collection of vectors. Are they independent? Say $a + b\sqrt{2} = 0$ where $a, b$ are rational numbers. If $a \neq 0$, then $b\sqrt{2} = -a$, which can't happen since $a$ is a nonzero rational while $b\sqrt{2}$ is either zero or irrational. If $b \neq 0$, then $-a = b\sqrt{2}$, which again can't happen because on the left is a rational number and on the right is an irrational. Hence both $a, b = 0$ and so this is a basis.

9.3.34 This is obvious because when you add two of these you get one and when you multiply one of these by a scalar, you get another one. A basis is $\{1, \sqrt{2}\}$. By definition, the span of these gives the collection of vectors. Are they independent? Say $a + b\sqrt{2} = 0$ where $a, b$ are rational numbers. If $a \neq 0$, then $b\sqrt{2} = -a$, which can't happen since $a$ is a nonzero rational while $b\sqrt{2}$ is either zero or irrational. If $b \neq 0$, then $-a = b\sqrt{2}$, which again can't happen because on the left is a rational number and on the right is an irrational. Hence both $a, b = 0$ and so this is a basis.


9.4.1 This is not a subspace. $\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$ is in it, but $20 \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$ is not.
9.4.2 This is not a subspace.


9.6.1 By linearity we have $T(x^2) = 1$, $T(x) = T(x^2 + x - x^2) = T(x^2 + x) - T(x^2) = 5 - 1 = 4$, and $T(1) = T(x^2 + x + 1 - (x^2 + x)) = T(x^2 + x + 1) - T(x^2 + x) = -1 - 5 = -6$.

Thus $T(ax^2 + bx + c) = aT(x^2) + bT(x) + cT(1) = a + 4b - 6c$.
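The linearity argument in 9.6.1 can be replayed as a short consistency check. The sketch below takes the three values used in the computation, $T(x^2) = 1$, $T(x^2 + x) = 5$, and $T(x^2 + x + 1) = -1$, and derives $T$ on the basis $x^2, x, 1$; the variable and function names are illustrative only.

```python
# Values used in the computation: T(x^2), T(x^2 + x), T(x^2 + x + 1).
t_x2, t_x2_x, t_x2_x_1 = 1, 5, -1

# Linearity gives T on the basis x^2, x, 1 by subtracting known values.
t_of_x2 = t_x2
t_of_x = t_x2_x - t_x2          # T(x) = T(x^2 + x) - T(x^2)
t_of_1 = t_x2_x_1 - t_x2_x      # T(1) = T(x^2 + x + 1) - T(x^2 + x)

def T(a, b, c):
    """Value of T(a x^2 + b x + c) obtained by linearity."""
    return a * t_of_x2 + b * t_of_x + c * t_of_1
```

Evaluating `T(1, 0, 0)`, `T(1, 1, 0)`, and `T(1, 1, 1)` should return the three starting values, which is a quick way to catch an arithmetic slip in the basis values.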


9.7.1 If $\sum_i a_i \vec{v}_i = 0$, then using linearity properties of $T$ we get

\[ 0 = T(0) = T\left(\sum_i a_i \vec{v}_i\right) = \sum_i a_i T(\vec{v}_i). \]

Since we assume that $\{T\vec{v}_1, \cdots, T\vec{v}_r\}$ is linearly independent, we must have all $a_i = 0$, and therefore we conclude that $\{\vec{v}_1, \cdots, \vec{v}_r\}$ is also linearly independent.

9.7.3 Since the third vector is a linear combination of the first two, the image of the third vector will also be a linear combination of the images of the first two. However, the images of the first two vectors are linearly independent (check!), and hence form a basis of the image.

Thus  a  basis  for  im(T)  is: 


9.8.1  In  this  case  dim(W)  =  1  and  a  basis  for  W  consisting  of  vectors  in  S  can  be  obtained  by  taking  any 
(nonzero)  vector  from  S. 


9.8.2  A  basis  for  ker  (T)  is 


and  a  basis  for  im  (T)  is 


There are many other possibilities for the specific bases, but in this case $\dim(\ker(T)) = 1$ and $\dim(\operatorname{im}(T)) = 1$.

9.8.3  In  this  case  ker(T)  =  {0}  and  im(T)  =  M2  (pick  any  basis  of  M2). 

9.8.4  There  are  many  possible  such  extensions,  one  is  (how  do  we  know?): 
9.8.5 We can easily see that $\dim(\operatorname{im}(T)) = 1$, and thus $\dim(\ker(T)) = 3 - \dim(\operatorname{im}(T)) = 3 - 1 = 2$.

9.9.1 (a) The matrix of T is the elementary matrix which multiplies the jth diagonal entry of the identity matrix by b.

(b)  The  matrix  of  T  is  the  elementary  matrix  which  takes  b  times  the  jth  row  and  adds  to  the  ith  row. 

(c)  The  matrix  of  T  is  the  elementary  matrix  which  switches  the  ith  and  the  jth  rows  where  the  two 
components  are  in  the  ith  and  jth  positions. 
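The three matrices described in 9.9.1 can be built directly from the identity. The following is a minimal sketch with 0-based indices; the function names are illustrative, and the sample vector is a made-up example rather than data from the exercise.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def scale_matrix(n, j, b):
    """Identity with the j-th diagonal entry multiplied by b."""
    e = identity(n)
    e[j][j] = b
    return e

def add_multiple_matrix(n, i, j, b):
    """Identity with b times row j added to row i (assumes i != j)."""
    e = identity(n)
    e[i][j] = b
    return e

def swap_matrix(n, i, j):
    """Identity with rows i and j switched."""
    e = identity(n)
    e[i], e[j] = e[j], e[i]
    return e

def apply(m, x):
    """Matrix-vector product."""
    return [sum(m[r][k] * x[k] for k in range(len(x))) for r in range(len(m))]

x = [10, 20, 30]
scaled = apply(scale_matrix(3, 1, 5), x)            # component 1 times 5
added = apply(add_multiple_matrix(3, 0, 2, 2), x)   # 2 * component 2 added to component 0
swapped = apply(swap_matrix(3, 0, 2), x)            # components 0 and 2 switched
```

Each matrix acts on a vector exactly as the corresponding row operation acts on the identity, which is why the matrix of each transformation is an elementary matrix.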


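9.9.2 Suppose

\[
\begin{bmatrix} \vec{c}_1^T \\ \vdots \\ \vec{c}_n^T \end{bmatrix}
= \begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix}^{-1}
\]

Thus $\vec{c}_i^T \vec{a}_j = \delta_{ij}$. Therefore,

\[
\begin{bmatrix} \vec{b}_1 & \cdots & \vec{b}_n \end{bmatrix}
\begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix}^{-1} \vec{a}_i
= \begin{bmatrix} \vec{b}_1 & \cdots & \vec{b}_n \end{bmatrix}
\begin{bmatrix} \vec{c}_1^T \\ \vdots \\ \vec{c}_n^T \end{bmatrix} \vec{a}_i
= \begin{bmatrix} \vec{b}_1 & \cdots & \vec{b}_n \end{bmatrix} \vec{e}_i
= \vec{b}_i
\]

Thus $T\vec{a}_i = \begin{bmatrix} \vec{b}_1 & \cdots & \vec{b}_n \end{bmatrix} \begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix}^{-1} \vec{a}_i = A\vec{a}_i$. If $\vec{x}$ is arbitrary, then since the matrix $\begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix}$ is invertible, there exists a unique $\vec{y}$ such that $\begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix} \vec{y} = \vec{x}$. Hence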
9.9.7
9.9.11 Recall that $\operatorname{proj}_{\vec{u}}(\vec{v}) = \frac{\vec{v} \cdot \vec{u}}{\|\vec{u}\|^2} \vec{u}$ and so the desired matrix has $i$th column equal to $\operatorname{proj}_{\vec{u}}(\vec{e}_i)$. Therefore, the matrix desired is
cross  product,  183,  184

area  of  parallelogram,  186 
adjugate,  130

geometric  description,  183,  184 
cylindrical  coordinates,  449 

back  substitution,  13 
basic  solution,  29
basis,  203,  204,  234,  487 
any  two  same  size,  489 
box  product,  188 

De  Moivre’s  theorem,  339 
determinant,  107 
cofactor,  109 

expanding  along  row  or  column,  110 
matrix  inverse  formula,  131 
minor,  108 

cardioid,  443 

Cauchy  Schwarz  inequality,  168 
characteristic  equation,  349 
chemical  reactions 
balancing,  33 

Cholesky  factorization 
positive  definite,  420 
classical  adjoint,  130 

Cofactor  Expansion,  110 
cofactor  matrix,  130 
column  space,  211 
complex  eigenvalues,  369 
complex  numbers 

absolute  value,  334 
addition,  329 
argument,  336 
conjugate,  331 
conjugate  of  a  product,  336 
modulus,  334,  336 
multiplication,  330 
polar  form,  336 
roots,  339 
standard  form,  329 
triangle  inequality,  334 
component  form,  162 
component  of  a  force,  265,  267 
consistent  system,  8 

product,  116,  121 
row  operations,  114 
diagonalizable,  363,  364,  407 
dimension,  205 

dimension  of  vector  space,  490 
direction  vector,  162 
distance  formula,  156 
properties,  157 
dot  product,  167 
properties,  167 

eigenvalue,  349 
eigenvalues 

calculating,  350 
eigenvector,  349 
eigenvectors 

calculating,  350 
elementary  matrix,  79 
inverse,  82 

elementary  operations,  9 
elementary  row  operations,  15 
empty  set,  537 
equivalence  relation,  362 
exchange  theorem,  204,  487 
extending  a  basis,  209 

field  axioms,  330 
finite  dimensional,  490 
force,  261

Fundamental  Theorem  of  Algebra,  329 

Gauss-Jordan  Elimination,  24 
Gaussian  Elimination,  24 
general  solution,  324 
solution  space,  323 

hyper-planes,  6 

idempotent,  93 

identity  transformation,  271,  500 
improper  subspace,  486 
included  angle,  170 
inconsistent  system,  8 
induction  hypothesis,  540 
injection,  293 
inner  product,  167 
intersection,  537 
intervals 

notation,  538 
invertible  matrices 
isomorphism,  512 
isomorphic,  298,  509 

equivalence  relation,  301 
isomorphism,  298,  509 
bases,  301 
composition,  300 
equivalence,  302,  513 
inverse,  300 
invertible  matrices,  512 

kernel,  215,  322 
Kirchhoff’s  law,  38 
Kronecker  symbol,  71,  239 

Laplace  expansion,  110 
leading  entry,  16 
least  square  approximation,  251
linear  combination,  30,  151 
linear  dependence,  194 
linear  independence,  195 

enlarging  to  form  a  basis,  492 
linear  independent,  233 
linear  map,  298 

defining  on  a  basis,  304 
image,  310 
kernel,  310 


linear  transformation,  270,  298,  500 
composite,  283 
image,  292 
matrix,  273 
range,  292 

linearly  dependent,  475 
linearly  independent,  475 
lines 

parametric  equation,  164 
symmetric  form,  164 
vector  equation,  162 
LU  decomposition 
non  existence,  99 
LU  factorization,  99 
by  inspection,  100 
justification,  102 
solving  systems,  101 

main  diagonal,  112 
Markov  matrices,  378 
mathematical  induction,  540 
matrix,  14,  53 
addition,  55 

augmented  matrix,  14,  15 

coefficient  matrix,  14 

column  space,  211 

commutative,  67 

components  of  a  matrix,  54 

conformable,  62 

diagonal  matrix,  363 

dimension,  14 

entries  of  a  matrix,  54 

equality,  54 

equivalent,  27 

finding  the  inverse,  74 

identity,  71

improper,  240

inverse,  71

invertible,  71

kernel,  215 

lower  triangular,  112 

main  diagonal,  363 

null  space,  215 

orthogonal,  128,  238 

orthogonally  diagonalizable,  375 

proper,  240 
properties  of  addition,  56
properties  of  scalar  multiplication,  58 
properties  of  transpose,  69 
raising  to  a  power,  373 
rank,  31
row  space,  211 
scalar  multiplication,  57 
skew  symmetric,  70 
square,  54 
symmetric,  70 
transpose,  69 
upper  triangular,  112 
matrix  exponential,  392 
matrix  form  AX=B,  61 
matrix  multiplication,  62 
ijth  entry,  65 
properties,  68 
vectors,  60 
migration  matrix,  378 
multiplicity,  350 
multipliers,  104 

Newton,  261 
nilpotent,  128 
non  defective,  407 
nontrivial  solution,  28 
null  space,  215,  322 
nullity,  219,  312 

one  to  one,  292 

linear  independence,  301 
onto,  292 
orthogonal,  235 
orthogonal  complement,  246 
orthogonality  and  minimization,  251
orthogonally  diagonalizable,  407 
orthonormal,  235 

parallelepiped,  188 
volume,  188 
parameter,  23 
particular  solution,  322 
permutation  matrices,  79 
pivot  column,  18 
pivot  position,  18
plane 

normal  vector,  179 


scalar  equation,  180
vector  equation,  179 
polar  coordinates,  439 
polynomials 

factoring,  341 
position  vector,  146 
positive  definite,  417 

Cholesky  factorization,  420 
invertible,  417 
principal  submatrices,  419 
principal  axes,  404,  429 
principal  axis 

quadratic  forms,  427 
principal  submatrices,  418 
positive  definite,  419 
proper  subspace,  201,  486 

QR  factorization,  422 
quadratic  form,  427 
quadratic  formula,  343 

random  walk,  380 

range  of  matrix  transformation,  292 

rank,  312 

rank  added  to  nullity,  312 
reflection 

across  a  given  vector,  291 
regression  line,  254 
resultant,  262 
right  handed  system,  183 
row  operations,  15,  114 
row  space,  211 

scalar,  8 

scalar  product,  167 
scalar  transformation,  500 
scalars,  150 
scaling  factor,  425 
set  notation,  537 
similar  matrices,  362 
similar  matrix,  356 
singular  values,  409 
skew  lines,  6 
solution  space,  322 
span,  192,  233,  472 
spanning  set,  472 
basis,  208 
spectrum,  349
speed,  263 

spherical  coordinates,  450 
standard  basis,  204 
state  vector,  378 
subset,  192,  471 
subspace,  202,  484 


vector  space  axioms,  455 
vectors,  58 


basis,  204,  234 

column,  58 

linear  dependent,  194 

linear  independent,  195,  233 

orthogonal,  235 

orthonormal,  235 

row  vector,  58 


basis,  204,  234 
dimension,  205 
has  a  basis,  491 


span,  192,  233 
velocity,  263 


span,  203 

summation  notation,  539 
surjection,  293 
symmetric  matrix,  401 
system  of  equations,  8 


zero  matrix,  54 
zero  subspace,  201 
zero  transformation,  271,  500 
zero  vector,  149 


well  ordered,  540 
work,  265 


homogeneous,  8 
matrix  form,  61 
solution  set,  8 


vector  form,  59 

trace  of  a  matrix,  363 
triangle  inequality,  169 
trigonometry 

sum  of  two  angles,  289 
trivial  solution,  28 

union,  537 
unit  vector,  158 

variable 
basic,  25 
free,  25 
vector,  147 

addition,  148 

addition,  geometric  meaning,  147 
components,  147 
coordinate  vector,  316 
corresponding  unit  vector,  158 
length,  158 
orthogonal,  172 
perpendicular,  172,  173 
points  and  vectors,  146 
projection,  174 
scalar  multiplication,  150 
subtraction,  149 
vector  space,  455 
dimension,  490 